1.  INTRODUCTION


Many problems in control system design or economic system modelling
naturally arise in a framework with multiple decision-makers. The study of
this type of problem is called game theory. The decision-makers are considered
as players striving to optimize their respective performance indices under
some a priori determined ground rules.

Various types of rules (called strategies) have been developed.

Some games have only a single performance index; for example, the team problem
(players optimize the same index but possibly under different information) [1],
and the 2-person zero-sum game (the performance index is the cost of one player
and the payoff of the other) [2]. Some games have multiple criteria; for
example, the 2-person nonzero-sum game under the Nash equilibrium concept (the
players optimize their respective performance indices simultaneously) [3], or
under the Stackelberg equilibrium concept (the leading player optimizes his
performance index knowing how the passive player will react) [4].

Game theory, though it can be considered a generalization of
single-person, single-criterion control theory, is a great deal more complex.
In particular, for the dynamic Stackelberg game, even in the seemingly simple
case of the linear-quadratic problem, it is extremely difficult to obtain any
analytic solution. Therefore, a modified scheme, the Restricted Stackelberg
Problem (RSP) ([5], [6]), is proposed. This is a Stackelberg game with a
specific information structure which allows the leader to announce his
strategy first but to act only after the follower has acted. By choosing
different representations of a given strategy, the leader can manipulate
the follower in various ways. In particular, the leader may be able to force
the follower to act as if he is also minimizing the leader's cost. In RSP,
we also restrict attention only to those Stackelberg solutions which attain
the lower bound of the leader's cost (the team cost). The focus of this
report is on RSP for a special class of problems, namely, discrete-time,
finite-horizon, linear-quadratic-Gaussian problems.

RSP, if solvable, is a powerful modelling tool. It can be readily
applied to many economic and control problems where a hierarchy of operation
clearly exists or is desired, and it is analytically tractable. In
economics, the government - industry - consumers hierarchy can be naturally
posed as a tri-level RSP. The government announces its regulation policy
first; the industry then stipulates a pricing strategy based on the announced
regulation. The consumers act first by making a certain amount of purchases
from the industry based on the price of the product or service the industry
supplies. In engineering, any large scale system wherein a single centralized
controller is impractical can potentially be modelled in the RSP framework
with layers of decentralized controllers with different priorities of
operation.

The investigation in this report is carried out using the dynamic-to-static
conversion, which collapses the dynamic evolution of a variable (over a
finite horizon) into a single vector. A dynamic problem can then be converted
into the static domain, and the results proven in this domain can be
transferred back to the dynamic domain. One feature of this technique is that
it bypasses a great deal of algebra and makes the qualitative features more
apparent, which is useful in establishing the existence of solutions.
However, in doing so, it sacrifices the recursiveness of the solutions, which
may be a crucial requirement in the implementation of the solutions.

Three classes of information structures are considered: the deterministic
centralized, the deterministic decentralized, and the stochastic.
Most of the results are obtained for the deterministic centralized information
structure. Sufficient conditions for the existence of RSP solutions are derived.
Some qualitative aspects of RSP are also examined: the dependence of the
solvability of RSP on the specific choice of information and representation,
the stationarity and the convexity conditions, the advantage of linear
solutions, and some interpretation of the given conditions. The decentralized
problem is approached in the same manner as the centralized case. The results
are similar if the initial data distribution is assumed known. The stochastic
RSP with perfect state information cannot be solved because of the inability
of the leader to detect whether the team solution is enforced or not. To
bypass this difficulty, we include both the state and the follower's control
in the leader's information. The problem then becomes similar to the other
cases.

In the situation where the conditions mentioned above are not
satisfied, the possibility of the leader using a large threat (a penalty on
the follower's deviation from the team trajectory) to achieve a cost near
his team cost is considered. It is shown that under certain mild conditions,
the infinite threat can achieve the team cost for the leader. It is, therefore,
reasonable to ask the question: under what conditions can the leader achieve
a cost arbitrarily close to his team cost using a large but finite threat? It
is shown that in general the leader does not possess such a strong
position and the case in which it holds is a variety in the parameter space.


This report is structured into four chapters. The definitions
and problem formulation are stated in Chapter 2. Chapter 3, the main bulk
of the work, is devoted to the various cases of RSP. The concluding chapter,
Chapter 4, summarizes the report and points out some future directions.


2.  PROBLEM  FORMULATION 


2.1.  Introduction 

By an abuse of language, we shall also let RSP stand for the
equilibrium strategy to be investigated in this report, which is a restricted
version of the Stackelberg equilibrium strategy as briefly discussed in
Section 1. The Stackelberg strategy was introduced by von Stackelberg [12] in
the static setting. The generalization to the dynamic case was first done in
[4]. The idea is that the commanding player (leader) announces his strategy at
each stage knowing how the follower will react to his strategy. The follower
then optimizes his performance index based on the leader's strategy. This
equilibrium strategy concept, although very appealing in terms of modelling, is
difficult to solve analytically in general (in the closed loop dynamic case).
The difficulty lies in the fact that the principle of optimality fails to
apply due to the dependence of the closed loop strategy on the length of the
horizon. To circumvent this difficulty, a restricted type of Stackelberg
strategy is considered in [5], [6]. This strategy concept, RSP, focuses on
the Stackelberg pair that achieves the team cost for the leader. The leader,
using the non-unique representation of his team strategy, adds on redundant
terms that have value zero on the team trajectory. By choosing the
appropriate redundancy (or the threat to the follower) the leader may be able
to force the follower to act as if he is also optimizing the leader's
performance.

In  this  chapter,  we  state  the  general  definitions  of  Stackelberg, 
Team,  and  Restricted  Stackelberg  problems.  Then  we  examine  some  of  the  past 
highlights  and  show  how  the  present  work  fits  into  the  lines  of  development. 


2.2.  Definitions 


Assume some underlying probability space $(\Omega, F, P)$ is given.

Let $x(0)$, $w(k)$, and $v_i(k)$, $k \in \{0, 1, \ldots, N-1\}$, $i \in \{1, 2\}$,
be random variables with respect to $(\Omega, F, P)$, with $x(0): \Omega \to R^n$,
whose statistics are assumed perfectly known.

We consider a discrete, time-varying, N-stage dynamic system
with $u_1(k)$ and $u_2(k)$ as input commands and $w(k)$ as the noise disturbance
entering at stage k:

$$x(k+1) = f(k, x(k), u_1(k), u_2(k), w(k)), \qquad k = 0, \ldots, N-1 \tag{2.1}$$

At stage k, assume information vectors $Z_i(k)$ are given:

$$Z_i(k) = Z_i(k, x(0), \ldots, x(k), u_1(0), \ldots, u_1(k-1), u_2(0), \ldots, u_2(k-1), y_i(k)) \tag{2.2}$$

Let $F_i(k)$ be the $\sigma$-algebra generated by $Z_i(k)$.

We require that $u_i(k) \in U_i(k)$, where

$$U_i(k) \triangleq \{ \gamma_k \mid \gamma_k(Z_i(k)) \text{ is } F_i(k)\text{-measurable} \}$$

The control objective of player i is to find a sequence of
admissible controls according to some equilibrium solution concept based
on the cost index

$$J_i\big(\{u_1(k)\}_{k=0}^{N-1}, \{u_2(k)\}_{k=0}^{N-1}\big) = E\Big\{ \sum_{k=0}^{N-1} L_{i,k}(x(k), u_1(k), u_2(k), k) + L_{i,N}(x(N)) \Big\} \tag{2.3}$$

We now define the following equilibrium solution concepts.

Definition 2.1

$\{u_1(k), u_2(k)\}_{k=0}^{N-1}$ is a closed loop Stackelberg sequence with player 1 as
leader, player 2 as follower if it solves

$$\min_{\substack{u_1(k) \in U_1(k) \\ k = 0, 1, \ldots, N-1}} J_1\Big(\{u_1(k)\}_{k=0}^{N-1}, \big\{\hat{u}_2\big(k, \{u_1(k)\}_{k=0}^{N-1}\big)\big\}_{k=0}^{N-1}\Big)$$

where

$$\big\{\hat{u}_2\big(k, \{u_1(k)\}_{k=0}^{N-1}\big)\big\}_{k=0}^{N-1} = \arg\min_{u_2(k) \in U_2(k)} J_2\big(\{u_1(k)\}_{k=0}^{N-1}, \{u_2(k)\}_{k=0}^{N-1}\big)$$

Definition 2.2

$\{u_1^t(k), u_2^t(k)\}_{k=0}^{N-1}$ is a team solution pair for the leader if it solves

$$\min_{\substack{u_i(k) \in U_i(k) \\ i = 1, 2 \\ k = 0, \ldots, N-1}} J_1(u_1(0), \ldots, u_1(N-1), u_2(0), \ldots, u_2(N-1))$$

Definition 2.3

Let $Z_1(k, u_2(k), u_2(k-1), \ldots, u_2(0))$ be some information set.

Then $\{u_1(Z_1(k, u_2(k), \ldots, u_2(0))), u_2(k)\}_{k=0}^{N-1}$ is the solution of RSP if
it solves

$$\{u_2^t(k)\}_{k=0}^{N-1} = \arg\min_{\substack{u_2(k) \in U_2(k) \\ k = 0, \ldots, N-1}} J_2\big(u_1(Z_1(0, u_2(0))), \ldots, u_1(Z_1(N-1, u_2(N-1), \ldots, u_2(0))), u_2(0), \ldots, u_2(N-1)\big)$$

and

$$u_1^t(k) = u_1(Z_1(k, u_2^t(k), u_2^t(k-1), \ldots, u_2^t(0))) \qquad \forall k \in \{0, 1, \ldots, N-1\}$$


2.3.  Problem  Formulation 

In  this  report,  we  consider  specifically  the  discrete,  finite- 
horizon,  linear-quadratic  deterministic  and  stochastic  gaussian  systems. 

The technique employed is the dynamic-to-static conversion. The time
evolution  is  collapsed  into  a  single  long  column  vector.  The  system  can  then 
be  viewed  as  a  static  entity;  however,  the  relationship  between  these  time- 
vectors  has  to  be  restricted  by  causality.  Thus,  the  techniques  available 
in  the  static  case  can  be  readily  applied  under  the  causality  constraint. 

The system under consideration is described by

$$X(k+1) = A(k) X(k) + B_1(k) U_1(k) + B_2(k) U_2(k) + W(k), \qquad k = 0, 1, \ldots, N-1 \tag{2.4}$$

$U_1(k)$, $U_2(k)$ are the controls of players 1 and 2 respectively at stage k.
The cost function of player i is given as:

$$J_i(\{U_i(k)\}, \{U_j(k)\}) = E\Big\{ X'(N) Q_i(N) X(N) + \sum_{k=0}^{N-1} \big[ X'(k) Q_i(k) X(k) + U_i'(k) R_{ii}(k) U_i(k) + U_j'(k) R_{ij}(k) U_j(k) \big] \Big\}, \qquad i, j = 1, 2, \ i \neq j \tag{2.5}$$

Assume also

$$Q_i(k) > 0, \qquad R_{ij}(k) > 0, \qquad i, j = 1, 2$$

$$W(k) \sim N(0, \Sigma_w(k))$$

$$X_0 \sim N(\bar{X}_0, \Sigma_0)$$

Note  that  in  the  usual  Nash  formulation,  R12  need  not  be  positive  definite. 

It  will  be  shown  in  Chapter  3  that  this  is  a  necessary  condition  for  RSP. 
We convert the above dynamic system into the static domain via the
following:

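Define

$$X \triangleq \begin{bmatrix} X(0) \\ \vdots \\ X(N) \end{bmatrix}, \qquad U_i \triangleq \begin{bmatrix} U_i(0) \\ \vdots \\ U_i(N-1) \end{bmatrix}, \qquad W \triangleq \begin{bmatrix} 0 \\ \vdots \\ \sum_{j=0}^{N-1} \Phi(N-1, j) W(j) \end{bmatrix} \tag{2.6}$$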

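Then the state equation collapses to

$$X = \begin{bmatrix} I \\ \vdots \\ \Phi(N, 0) \end{bmatrix} X(0) + \sum_{i=1}^{2} \begin{bmatrix} 0 & \cdots & 0 \\ B_i(0) & \ddots & \vdots \\ \vdots & \ddots & 0 \\ \Phi(N-1, 0) B_i(0) & \cdots & B_i(N-1) \end{bmatrix} U_i + \begin{bmatrix} 0 \\ \vdots \\ \sum_{j=0}^{N-1} \Phi(N-1, j) W(j) \end{bmatrix}$$

$$\triangleq D X(0) + \sum_{i=1}^{2} H_i U_i + W \tag{2.7}$$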
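The collapse of the state recursion into a single stacked equation can be illustrated numerically. The following is a minimal sketch, not the report's own procedure: it assumes a noise-free system with one input channel, and the names `stack_dynamics` and `simulate` are hypothetical. The stacked matrices are built by running the recursion block row by block row, so no particular indexing convention for the transition matrix is assumed.

```python
import numpy as np

def stack_dynamics(A_list, B_list):
    """Return (D, H) such that X = D @ x0 + H @ U for x(k+1) = A(k) x(k) + B(k) u(k).

    X stacks x(0), ..., x(N); U stacks u(0), ..., u(N-1).
    """
    N = len(A_list)
    n = A_list[0].shape[0]
    m = B_list[0].shape[1]
    D = np.zeros(((N + 1) * n, n))
    H = np.zeros(((N + 1) * n, N * m))
    D[:n] = np.eye(n)                      # x(0) = x0, no input reaches x(0)
    for k in range(N):
        prev = slice(k * n, (k + 1) * n)
        nxt = slice((k + 1) * n, (k + 2) * n)
        D[nxt] = A_list[k] @ D[prev]       # propagate the initial state
        H[nxt] = A_list[k] @ H[prev]       # propagate earlier inputs
        H[nxt, k * m:(k + 1) * m] += B_list[k]  # u(k) enters x(k+1) directly
    return D, H

def simulate(A_list, B_list, x0, u_seq):
    """Run the recursion stage by stage and return the stacked trajectory."""
    xs = [x0]
    for A, B, u in zip(A_list, B_list, u_seq):
        xs.append(A @ xs[-1] + B @ u)
    return np.concatenate(xs)
```

For the two-player system, the same routine would be called once per player with that player's input matrices to obtain each stacked input matrix; the noise term is omitted here. Because an input at stage k can only affect later states, the resulting `H` is block lower triangular.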


Similarly,

$$J_i = E[X' Q_i X + U_i' R_{ii} U_i + U_j' R_{ij} U_j] \tag{2.8}$$

$$Q_i = \mathrm{diag}[Q_i(0), \ldots, Q_i(N)]$$

$$R_{ij} = \mathrm{diag}[R_{ij}(0), \ldots, R_{ij}(N-1)]$$


Causality  is  an  important  property  of  the  functional  mappings  in  this  setting. 
It  is  characterized  in  a  simple  way  for  matrices,  namely,  the  block  lower 
triangularity  implies  causality. 

We  therefore  define  the  following: 

Definition 2.5:

A matrix $F = [f_{ij}]$, where each $f_{ij}$ is some matrix with known dimension, is

causal if $f_{ij} = 0 \ \ \forall j > i$,

strictly causal if $f_{ij} = 0 \ \ \forall j \geq i$.
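The causality definition amounts to a block lower triangularity test, which is easy to check numerically. The sketch below is illustrative only; the function name `is_block_causal` and its arguments are hypothetical, and all blocks are assumed to share one row size and one column size.

```python
import numpy as np

def is_block_causal(F, row_size, col_size, strict=False, tol=1e-12):
    """Check that every block f_ij with j > i (or j >= i if strict) is zero."""
    rows, cols = F.shape
    if rows % row_size or cols % col_size:
        raise ValueError("F does not split evenly into blocks")
    for i in range(rows // row_size):
        for j in range(cols // col_size):
            must_vanish = (j >= i) if strict else (j > i)
            if must_vanish:
                block = F[i * row_size:(i + 1) * row_size,
                          j * col_size:(j + 1) * col_size]
                if np.any(np.abs(block) > tol):
                    return False
    return True
```

A gain matrix that passes the non-strict test lets the control at stage i use data up to stage i; passing the strict test means it uses data only up to stage i-1.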


The advantage of working in this pseudo-static domain, as stated before,
is the simplicity of the algebra and the applicability of static results in a
straightforward manner. However, the same results may also be obtained by
using dynamic programming.

2.4.  Past  Development  in  RSP  and  Team  Problem 


To study RSP, it is certainly necessary to solve the corresponding
team problem. For the centralized, deterministic information structure, the
team problem is the same as the optimal control problem, the solution of
which is of course well known. Unfortunately, in the general decentralized
case, it requires an infinite dimensional filter to generate all
the estimates. Therefore, for the sake of realizability, additional
assumptions on the information have to be made. One type of assumption
([18], [19]) restricts information to that generated by a finite-dimensional,
linear filter; the optimal solution can then be found. Another type of
assumption ([8], [9], [10]) is nested information, where observations are
shared with one-step delay. This report uses an idea similar to that of
([18], [19]): parameter optimization is used to find the best linear
strategies. Due to the conversion to the static domain, the sufficient
conditions are stated in particularly simple forms.

RSP is formally investigated by Basar [5] and Papavassilopoulos [6,7]
under perfect state information. Sufficient conditions are obtained in
each case for the RSP solution to exist. However, some issues are left
mostly unaddressed: the effect of the leader's information structure on the
solvability of RSP, RSP under large threat, the possibility of a suboptimal
RSP solution should the sufficiency conditions fail, the qualitative
interpretation of the conditions, etc. The stochastic RSP given the state
information only is in general unsolved and appears unsolvable in the dynamic
case. It is solvable in the static setting, however, as in [13], [14]. In this
report, we include the follower's past control in the leader's information
structure and are, therefore, able to solve the problem. We also solve the
deterministic decentralized RSP under the linear strategy constraint (to
bypass the difficulty in the general team problem). The centralized
deterministic RSP is also studied, and some of the previously little-touched
issues are explored. However, a great many open problems remain, especially
with regard to the near-optimal solutions in the stochastic RSP.


3.  RESTRICTED  STACKELBERG  PROBLEM


3.1.  Introduction 

Dynamic RSP has been studied in [5], [6], in which conditions
for enforcing the team solution for the leader are obtained under perfect
state information in the deterministic problem. Some results on the
static stochastic RSP are presented in [13], [14]. Here we first examine
the deterministic RSP under various information structures and then the
stochastic RSP under a specific information pattern.

The  RSP  is  approached  as  follows: 

1.  Solve  the  team  problem  for  the  leader  under  the  given  information 
structure. 

2.  Choose one representation of the leader's team strategy such that it
depends on the follower's decision non-trivially.

3.  Find conditions this representation must satisfy such that the follower's
decision from his own optimization coincides with the team solution.

We shall consider the following information structures
(let $Z_i(k) \triangleq$ information available to $U_i(k)$):

Deterministic

a.  $Z_1(k) = [U_2(k), U_2(k-1), \ldots, U_2(0), X_0]$
    $Z_2(k) = [X(k), X(k-1), \ldots, X_0]$

b.  $Z_1(k) = [X(k), X(k-1), \ldots, X_0]$
    $Z_2(k) = [X(k), X(k-1), \ldots, X_0]$

c.  $Z_1(k) = [U_2(k), U_2(k-1), \ldots, U_2(0), X(k), X(k-1), \ldots, X_0]$
    $Z_2(k) = [X(k), X(k-1), \ldots, X_0]$

d.  $Z_1(k) = [Y_1(k), Y_1(k-1), \ldots, Y_1(0)]$
    $Z_2(k) = [Y_2(k), Y_2(k-1), \ldots, Y_2(0)]$
    ($Y_1(\cdot)$, $Y_2(\cdot)$ are non-nested.)

Stochastic

e.  $Z_1(k) = [U_2(k), \ldots, U_2(0), X(k), X(k-1), \ldots, X(0)]$
    $Z_2(k) = [X(k), X(k-1), \ldots, X(0)]$

Note:

1.  We have allowed $U_1(k)$ to be dependent on $U_2(k)$. This certainly is not
physically possible since $U_1(k)$ needs a nonzero amount of time for
computation. However, here we assume that the interval between two
stages is long relative to the delay; thus, we can consider the
strategies $U_1(k)$ and $U_2(k)$ as being implemented at the same stage.
If precision is needed to include this delay in the model, we can
subdivide the interval and let $U_1(k)$ depend on $U_2(k-1), \ldots, U_2(0)$ only.
In either case, the subsequent results are the same. Care only needs
to be taken to restrict the matrix coefficient mapping $U_2$ to $U_1$ to be
causal (block lower triangular) in the former case and strictly causal
(strictly block lower triangular) in the latter.

2.  Cases (a), (b), (c) are considered to examine the impact of the leader's
information structure on the solvability of RSP. Case (d) is the
general deterministic decentralized information structure, in which the team
solution cannot be obtained in general. Therefore, the best linear
strategies are derived and RSP is solved based on the assumption that
the leader enforces these strategies. In the stochastic RSP with only the
state information, the leader has no way of enforcing his team solution
since the team trajectory depends on the sample path of a Gaussian
random process. In (e), we include the follower's past strategies as
well so that the leader can use them for the threat.

The solvability of RSP is also viewed from the asymptotic behavior of
the follower's strategy as a function of the strength of the leader's
threat. It is shown that under some mild conditions, if the leader
threatens to play an infinite control for any deviation of the follower's
strategy from the desired value, the leader's team solution can be enforced.
Since infinite gain is not physically possible, we examine the possibility
of a large, finite threat. It is shown, with the aid of an example, that
arbitrary closeness to the team cost may not be enforceable with a linear
representation no matter how large (but finite) the threat is. However,
if discontinuous strategies are allowed for the leader, it can be shown
that arbitrary closeness to the team cost can then be achieved with a
large, finite threat.


(3.2) 


Using the deterministic counterpart of Radner's Theorem [1], set

$$P[\nabla_{U_i} J_1] = 0, \qquad i = 1, 2 \tag{3.3}$$

where

$$P = \mathrm{diag}[P(0), \ldots, P(N-1)] \tag{3.4}$$

and $P(k) \triangleq$ projection onto the space spanned by $[X(0), \ldots, X(k)]$.

Then,

$$U_i^t = -P[(H_i' Q_1 H_i + R_{1i})^{-1} H_i' Q_1 (D X_0 + H_j U_j)] = -P[R_{1i}^{-1} H_i' Q_1 X] \tag{3.5}$$

Note that we have used the assumption $R_{1i} > 0$, since otherwise impulsive $U_i$
may result.

We notice that $U_i^t$ has a non-causal dependence on X. Therefore, we use the
following transformation to obtain a causal representation.

Proposition 3.1

Given $U_i^t$ as

$$U_i^t = -P R_{1i}^{-1} H_i' Q_1 X$$

Assume 


Gk" 


where 


A  (1> 
K,k+1 


1  -  'k>  Viii  b2  (k) 

<k>  <k> 


is  invertible 


(3.6) 


J  [Rli'1(k)  Bt'  (k)  (l-l,k)  Ql(4)Ji^l  (A  (j)  + 


Bx  (j)  gx  (J)  +  \  (j)  82  (j))] 


(3.7) 


MW 


8X  U) 

gj  (J) 


d  (1> 

j,j+l 

d  <2\ 

jj+l 


A(j) 


(3.8) 


Then $U_i^t(k) = g_i(k) X(k)$ is a causal version of (3.5).

Proof

See Appendix 1.

Q.E.D.

We write the closed loop solution as

$$U_i^t = G_i X \tag{3.9}$$

where $G_i$ is block diagonal with components $g_i(k)$ as calculated in
Proposition 3.1.

It is well known that the open loop and closed loop versions of the control
lead to the same state trajectory. For the open loop:

$$X^t = (I - H_1 G_1 - H_2 G_2)^{-1} D X_0 \tag{3.10}$$

$$U_i^t = G_i (I - H_1 G_1 - H_2 G_2)^{-1} D X_0 \triangleq G_i^0 X_0 \tag{3.11}$$
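In the static domain the deterministic team problem is an ordinary quadratic minimization, so the open loop team controls can be computed with one linear solve. The sketch below is an illustration under stated assumptions rather than the report's derivation: the initial state is treated as known, the projection P is omitted, and the function name `team_open_loop` is hypothetical. It stacks both controls and solves the stationarity equations of the leader's cost directly.

```python
import numpy as np

def team_open_loop(D, H1, H2, Q1, R11, R12, x0):
    """Minimize X'Q1X + U1'R11U1 + U2'R12U2 subject to X = D x0 + H1 U1 + H2 U2."""
    m1, m2 = H1.shape[1], H2.shape[1]
    H = np.hstack([H1, H2])                       # joint input matrix
    R = np.block([[R11, np.zeros((m1, m2))],
                  [np.zeros((m2, m1)), R12]])     # joint control weight
    # Stationarity: (H'Q1H + R) U = -H'Q1 D x0
    U = np.linalg.solve(H.T @ Q1 @ H + R, -H.T @ Q1 @ D @ x0)
    U1, U2 = U[:m1], U[m1:]
    X = D @ x0 + H1 @ U1 + H2 @ U2
    return U1, U2, X
```

At the returned point each control satisfies the relation between the team control and the team trajectory given in (3.5), with the projection dropped, which gives a quick consistency check on any candidate team solution.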


Remarks 


With  state  feedback  the  first  order  condition  should  actually  be 

“i +  p  {  'V  Qi  ®x„ +  Hj  °j>i  +  7Ui  ujt<Hj'QiKj  +  Ru)  uj 

+  Hj  *  Q1(DXq  +  K±  1^)]^  -  0 


v  %  -.a.  .n  /.  ,v 


One  solution  is  the  pair  (3.5),  which  is  the  open  loop  solution.  The  only 


case  non¬ 


uniqueness  may  occur  is  when  V  U,  ^  U, 

i  J  i 


But 


u . 


U.  V 

■i  J  Uj 
and  Ker 


Vx  Uj  Hi  7x  Uj  Hj 


Hj  *  0 


V  J  u.  V  .  u. 
ui  j  uj  i 


*  I 


The closed loop team solution is therefore the same as the open loop
solution in the sense that they both achieve the lower bound of the leader's
cost. It is, however, immediately noticed that such an advantage is not
enjoyed in the multicriteria case, e.g., Nash or Stackelberg. In these cases,
nested information is used to eliminate the $\nabla_{u_i} u_j$ terms.


3.2.2.  Conditions for enforcing the team solution
(i)  Sufficient conditions

In this section, we derive sufficient conditions for the
leader to enforce his team solution using the non-uniqueness of the
representation of his team strategy. The conditions are composed of the first
order stationarity condition, the second order convexity condition, and the
additional assumption that if the leader's strategy is fixed, the follower's
optimization admits his part of the team solution as a globally minimizing
solution. The stationarity and the convexity conditions are investigated
further for any differentiable representation of the leader's team strategy.
More specific conditions are then obtained for each case. For a linear
representation, it is shown that the convexity condition and the global
minimum condition are always satisfied.


We choose a representation of $U_1$ as

$$U_1 = \gamma_1(Z_1(U_2)) \tag{3.12}$$

where $\gamma_1(\cdot)$ is chosen to satisfy

(1)  $\arg\min_{U_2} J_2(\gamma_1(Z_1(U_2)), U_2) = U_2^t$  (3.13)

(2)  $\gamma_1(Z_1(U_2^t)) = U_1^t$  (3.14)

We define functions with property (2) as class-T functions. The objective
here is to find sufficient conditions for (1) under information structures (a),
(b), (c), given that $\gamma_1(\cdot)$ is a class-T function. The information
available to the leader, $Z_1(\cdot)$, is some function dependent on $U_2$ in a
causal manner. If $Z_1$ is independent of $U_2$, the leader will have no way of
influencing the follower's optimization.

We  now  state  the  sufficient  conditions  and  the  proof: 

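Theorem 3.2

Assume

(i)  $\gamma(Z_1)$ is a causal, differentiable, class-T function

(ii)  $J_2(\gamma(Z_1), U_2)$ is convex

(iii)  $Z_2 = \{X_0\}$

(iv)  $\exists\, F \triangleq \nabla_{U_2} \gamma(Z_1) \big|_{U_2 = U_2^t}$  ($\nabla_{U_2} \triangleq$ total differential with respect to $U_2$)  (3.15)

such that

$$[(R_{22} G_2 + H_2' Q_2) + F'(R_{21} G_1 + H_1' Q_2)] (I - H_1 G_1 - H_2 G_2)^{-1} D = 0 \tag{3.16}$$

Then $U_1 = \gamma(Z_1)$ will force $U_2$ to adopt $U_2^t$.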



Proof:

If $U_1 = \gamma(Z_1)$ is a class-T function, $J_2$ is convex, and
$P_{Z_2}[\nabla_{U_2} J_2] \big|_{U_2 = U_2^t} = 0$, then the global minimum of
$J_2(\gamma(Z_1), U_2)$ is attained with the pair $(U_1^t, U_2^t)$. Therefore, it
suffices to show that $P_{Z_2}[\nabla_{U_2} J_2] \big|_{U_2 = U_2^t} = 0$ is implied by
condition (iv). ($P_{Z_2}$ is the projection onto the space spanned by $Z_2$.)

We know that knowing $X_0$ is sufficient to achieve the lower bound of the
cost function in a deterministic control problem. And since $\{X_0\}$ and $\{X\}$
are equivalent in the sense that they both achieve the minimum, we can
substitute P (projection as in (3.3), (3.4)) for $P_{Z_2}$.

$$P[\nabla_{U_2} J_2] = P[(\nabla_{U_2} X)' Q_2 X + (\nabla_{U_2} \gamma)' R_{21} \gamma + R_{22} U_2] = 0 \tag{3.17}$$

$$\nabla_{U_2} X = H_2 + H_1 \nabla_{U_2} \gamma$$

$$P[(\nabla_{U_2} \gamma(Z_1))' (H_1' Q_2 X + R_{21} \gamma(Z_1)) + H_2' Q_2 X + R_{22} U_2] = 0$$

Let $U_2 = G_2 X^t$; then $\gamma(U_2) = G_1 X^t$.

It is sufficient then that

$$\Big[ (\nabla_{U_2} \gamma(Z_1))' \big|_{U_2 = U_2^t,\, X = X^t} (H_1' Q_2 + R_{21} G_1) + (H_2' Q_2 + R_{22} G_2) \Big] X^t = 0 \tag{3.18}$$

for all possible $X^t$. Now

$$X^t = (I - H_1 G_1 - H_2 G_2)^{-1} D X_0$$

and there is no restriction on $X_0$. Hence

$$\Big[ (\nabla_{U_2} \gamma(Z_1))' \big|_{U_2 = U_2^t,\, X = X^t} (H_1' Q_2 + R_{21} G_1) + (H_2' Q_2 + R_{22} G_2) \Big] (I - H_1 G_1 - H_2 G_2)^{-1} D = 0$$

Q.E.D.
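The mechanism behind the theorem can be seen in a scalar, single-stage analogue. The sketch below is an assumption-laden illustration, not the report's setting: all quantities are scalars, the state is the single value x = d*x0 + h1*u1 + h2*u2, the leader uses the linear class-T representation u1 = u1t + F*(u2 - u2t), and every function name is hypothetical. The slope F is chosen so that the follower's cost is stationary at the team point, which is the scalar counterpart of the stationarity condition in the theorem.

```python
def team_solution(d, h1, h2, q1, r11, r12, x0):
    """Minimize J1 = q1*x^2 + r11*u1^2 + r12*u2^2 with x = d*x0 + h1*u1 + h2*u2."""
    # Stationarity in (u1, u2) is a 2x2 linear system, solved by Cramer's rule.
    a11, a12 = h1 * h1 * q1 + r11, h1 * h2 * q1
    a21, a22 = h1 * h2 * q1, h2 * h2 * q1 + r12
    b1, b2 = -h1 * q1 * d * x0, -h2 * q1 * d * x0
    det = a11 * a22 - a12 * a21
    u1 = (b1 * a22 - a12 * b2) / det
    u2 = (a11 * b2 - a21 * b1) / det
    return u1, u2, d * x0 + h1 * u1 + h2 * u2

def threat_gain(h1, h2, q2, r21, r22, u1t, u2t, xt):
    """Slope F making J2 stationary at the team point (scalar stationarity)."""
    denom = h1 * q2 * xt + r21 * u1t
    if abs(denom) < 1e-12:
        raise ValueError("the leader's control has no leverage at the team point")
    return -(h2 * q2 * xt + r22 * u2t) / denom

def follower_best_response(d, h1, h2, q2, r21, r22, x0, u1t, u2t, F):
    """Minimize J2 = q2*x^2 + r22*u2^2 + r21*u1^2 when u1 = u1t + F*(u2 - u2t)."""
    a = u1t - F * u2t                 # u1 = a + F*u2
    c = d * x0 + h1 * a               # x  = c + g*u2
    g = h2 + h1 * F
    curvature = q2 * g * g + r22 + r21 * F * F   # positive when r22 > 0
    return -(q2 * g * c + r21 * F * a) / curvature
```

With a linear representation the follower's cost is a strictly convex quadratic in his own control whenever r22 > 0, so stationarity at the team point is enough for the follower's best response to coincide with his team control.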


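Discussion:

1.  The above theorem holds for the information structures (a), (b), (c).
However, the solvability differs in each case due to the different
$\nabla_{U_2} \gamma(Z_1)$ expressions. Let F be defined as in (3.16).

For (a), $Z_1 = \{U_2, X_0\}$, $Z_2 = \{X_0\}$:

$$\nabla_{U_2} \gamma(U_2) \big|_{U_2 = U_2^t} = F \tag{3.19}$$

For (b), $Z_1 = \{X\}$, $Z_2 = \{X\}$:

$$\nabla_{U_2} \gamma(X) = \nabla_X \gamma(X)\, \nabla_{U_2} X$$

$$\nabla_{U_2} X = H_1 \nabla_X \gamma(X)\, \nabla_{U_2} X + H_2 = (I - H_1 \nabla_X \gamma(X))^{-1} H_2$$

We need

$$\nabla_X \gamma(X) \big|_{X = X^t} \big( I - H_1 \nabla_X \gamma(X) \big|_{X = X^t} \big)^{-1} H_2 = F$$

or

$$(\nabla_X \gamma(X)) \big|_{X = X^t} (H_2 + H_1 F) = F \tag{3.20}$$

For (c), $Z_1 = \{U_2, X\}$, $Z_2 = \{X\}$:

Let $\nabla^{(i)}$ denote the gradient with respect to the ith variable; $\nabla$ denotes
the total gradient.

$$\nabla_{U_2} X = H_1 \big( \nabla_X^{(1)} \gamma(X, U_2)\, \nabla_{U_2} X + \nabla_{U_2}^{(2)} \gamma(X, U_2) \big) + H_2 = \big( I - H_1 \nabla_X^{(1)} \gamma(X, U_2) \big)^{-1} \big( H_1 \nabla_{U_2}^{(2)} \gamma(X, U_2) + H_2 \big)$$

$$\nabla_{U_2} \gamma(X, U_2) = \nabla_X^{(1)} \gamma(X, U_2)\, \nabla_{U_2} X + \nabla_{U_2}^{(2)} \gamma(X, U_2) = \nabla_X^{(1)} \gamma(X, U_2) \big( I - H_1 \nabla_X^{(1)} \gamma(X, U_2) \big)^{-1} \big( H_1 \nabla_{U_2}^{(2)} \gamma(X, U_2) + H_2 \big) + \nabla_{U_2}^{(2)} \gamma(X, U_2)$$

$$\nabla_X^{(1)} \gamma(X, U_2) \big|_{X = X^t,\, U_2 = U_2^t} (H_2 + H_1 F) + \nabla_{U_2}^{(2)} \gamma(X, U_2) \big|_{X = X^t,\, U_2 = U_2^t} = F \tag{3.21}$$

Note that (3.21) reduces to (3.19) or (3.20) if $\nabla_X^{(1)} \gamma(X, U_2)$ or
$\nabla_{U_2}^{(2)} \gamma(X, U_2)$ is set to zero respectively.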

As a design method, F is first solved from (3.16). Then depending on
the information structure, the appropriate $\gamma(\cdot)$ can be chosen.
However, the ability to choose $\gamma(\cdot)$ differs in each case. Given F:

in (a), (c), $\nabla \gamma$ can always be solved.

in (b), it is necessary and sufficient that $\mathrm{Ker}(H_2 + H_1 F) \subset \mathrm{Ker}\, F$.  (3.22)
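The kernel condition for case (b) can be tested numerically, and a state gain can be recovered when it holds. The following is a rough sketch under assumptions that go beyond the text: the strategy is linear with constant gain K satisfying K(H_2 + H_1 F) = F as in (3.20), causality of K is not enforced, and the function names are hypothetical. Kernel inclusion is checked through ranks, and the gain is obtained with a pseudoinverse.

```python
import numpy as np

def kernel_contained(M, F, tol=1e-9):
    """True if Ker M is contained in Ker F (M and F have the same column count)."""
    rank_M = np.linalg.matrix_rank(M, tol=tol)
    rank_stacked = np.linalg.matrix_rank(np.vstack([M, F]), tol=tol)
    # Stacking F under M leaves the rank unchanged exactly when the rows of F
    # lie in the row space of M, i.e. when Ker M is inside Ker F.
    return rank_stacked == rank_M

def solve_state_gain(H1, H2, F, tol=1e-9):
    """Return one K with K (H2 + H1 F) = F, or None if the kernel test fails."""
    M = H2 + H1 @ F
    if not kernel_contained(M, F, tol):
        return None
    return F @ np.linalg.pinv(M)   # minimum-norm solution; causality not imposed
```

A gain returned this way would still need to be checked for causality before it could serve as a leader's strategy; the sketch only addresses the algebraic solvability expressed by (3.22).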

Even though, given F, (c) does not seem to offer anything extra in terms
of the solvability of RSP, the additional term in (3.21) does provide
freedom to possibly attain other desirable features (e.g., sensitivity,
convexity, etc.). In comparing the three information structures, we
conclude that (b) is more restrictive than (a) and (c), and that (c) offers
additional freedom to fine tune other features of the solution.

Some of the assumptions in the theorem may seem restrictive; however,
they are in fact due to reasonable necessity.

Assumption (i) restricts the class of the leader's strategies. Causality
and class-T are necessary; differentiability helps to carry out the
optimization analytically. In general, these are not very stringent since a
large class of functions still remains.

Assumption (ii) states the convexity condition. It is necessary to
guarantee the existence of at least one local relative minimum.

Assumption (iii) restricts the theorem to perfect state information.
The decentralized case will be treated separately later.

Assumption (iv) is the stationarity condition. The expression in (3.16)
is necessary and sufficient provided F is finite (the infinite gain case
will be discussed later).

Note that requiring $\nabla_{U_2} \gamma(Z_1) \big|_{U_2 = U_2^t}$ to be a constant matrix
imposes a strong restriction on non-linear functions, since it implies that
all the terms of order higher than or equal to two will have to vanish on the
team trajectory. Furthermore, it will be shown that the convexity condition
also becomes very restrictive for non-linear functions. Both of these suggest
the linear strategy as the ideal candidate, since both conditions are then
trivially satisfied. However, for the state information case, it is shown in
[21] that, in certain examples, only non-linear solutions exist. Another
property to notice is that if an F exists in (3.16), it is causal. This is
certainly necessary for a linear strategy.

(ii)  Stationarity and Convexity:

In this section, we examine the stationarity and the convexity conditions
((iv) and (ii) respectively) in more detail. The stationarity condition is
expressed in geometric language. A sufficient condition for convexity
is derived.

Stationarity:

In (3.16), we have $\frac{N(N+1)}{2} (m_1 \times m_2)$ unknowns (due to the causal
structure of F), and $N (n \times m_2)$ equations. Assume that all equations are
independent. Then we require $N \geq \frac{2n}{m_1} - 1$. If equality holds, the
solution F is unique. If strict inequality holds, there are, in general,
infinitely many solutions. The advantage of this freedom and the ways of
utilizing it require further study. If the inequality fails, we then have to
solve F as a function of $X_0$, in which case $N \geq \frac{2}{m_1} - 1$ always holds.


In the case of state information (information structure (b)), we
have the additional equation (3.20) to solve. Observe that in the case of a
linear strategy, $\nabla_X(\gamma(X))$, a constant matrix, must have the last n
columns equal to zero due to the causality restriction. Therefore, for (3.20),
the last $m_2$ columns of F ($N m_1 m_2$ elements) must also be zero. We have then

$$\frac{N(N+1)}{2} m_1 m_2 - N m_1 m_2 = \frac{N(N-1)}{2} m_1 m_2$$

unknowns and $N n m_2$ equations. Thus, we need $N \geq \frac{2n}{m_1} + 1$ in general,
i.e., if all equations are independent. Putting together the above constraint
and (3.16), we have:

Proposition  3.3 

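The equation (3.16) has a solution F if

(i) $\ker(R_{21}G_1 + H_1'Q_2) \cap \operatorname{Im}(I - H_1G_1 - H_2G_2)^{-1}D \subset \ker(R_{22}G_2 + H_2'Q_2) \cap \operatorname{Im}(I - H_1G_1 - H_2G_2)^{-1}D$   (3.23)

(ii) $\operatorname{rank}\,(R_{21}G_1 + H_1'Q_2)(I - H_1G_1 - H_2G_2)^{-1}D \ge \operatorname{rank}\,(R_{22}G_2 + H_2'Q_2)(I - H_1G_1 - H_2G_2)^{-1}D$   (3.24)

(iii) $N \ge \frac{2n}{m_1} - 1$.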


Proof: Write (3.16) as

$$A = -F'B.$$

It is necessary and sufficient that

$$\operatorname{Im} A = \operatorname{Im} F'B \quad \text{and} \quad \ker A = \ker F'B.$$

The first condition and one direction of the second condition ($\ker A \subset \ker F'B$)
are taken care of by choosing F appropriately and by condition (ii).
We know this is possible because $N \ge \frac{2n}{m_1} - 1$ implies that the number of unknowns is
at least the number of constraints.

Now we need $\ker A \supset \ker F'B$. Since F is free to be chosen, we
only need to require $\ker A \supset \ker B$. Substituting the respective
expressions for A and B, the result follows.

Q.E.D.


Condition (i) means that for all possible team trajectories $x^t$,
$R_{21}u_1^t + H_1'Q_2x^t = 0 \Rightarrow R_{22}u_2^t + H_2'Q_2x^t = 0$. This is certainly necessary, since
if $R_{21}u_1^t + H_1'Q_2x^t = 0$ and $R_{22}u_2^t + H_2'Q_2x^t \ne 0$, then $u_2^t$ is not the optimal
solution for the follower, while $u_2 = -P_tR_{22}^{-1}(H_2'Q_2x^t)$ is. Condition (iii)
simply requires the number of unknowns to be no smaller than the number of
equations.
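The counting argument behind condition (iii) can be checked mechanically. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes, as the text does, that all equations are independent. It counts the free entries of a block lower-triangular F and compares them with the number of scalar equations, for both the control-information and the state-information cases.

```python
from fractions import Fraction


def count_unknowns(N, m1, m2, state_info=False):
    """Free scalar entries of a causal (block lower-triangular) F.

    F has N x N blocks of size m1 x m2. With state information only,
    the text removes a further N*m1*m2 entries (assumption taken from
    the causality argument in the text).
    """
    blocks = N * (N - 1) // 2 if state_info else N * (N + 1) // 2
    return blocks * m1 * m2


def count_equations(N, n, m2):
    """Scalar equations in the stationarity condition: N blocks of n x m2."""
    return N * n * m2


def stage_bound(n, m1, state_info=False):
    """Lower bound on N from the count: 2n/m1 - 1, or 2n/m1 + 1 with state information."""
    return Fraction(2 * n, m1) + (1 if state_info else -1)


def generically_solvable(N, n, m1, m2, state_info=False):
    """True when there are at least as many unknowns as equations."""
    return count_unknowns(N, m1, m2, state_info) >= count_equations(N, n, m2)
```

For a scalar 2-stage problem (N = 2, n = m1 = m2 = 1), this gives 3 unknowns against 2 equations with control information, and 1 unknown against 2 equations with state information.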

-  Convexity 

Recall that $U_2$ is defined as the set of all functions $u_2$ measurable
with respect to the σ-algebra generated by the information structure. $U_2$
is certainly convex, since if $u_2, \tilde u_2 \in U_2$, then $\alpha u_2 + (1-\alpha)\tilde u_2 \in U_2$. We have
assumed the differentiability of $\gamma(z_1)$; therefore, $J_2(\gamma(z_1), u_2)$ is convex
over $U_2$ if and only if $\nabla^2_{u_2}J_2(\gamma(z_1), u_2) \ge 0$.


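$$\nabla_{u_2}J_2(\gamma(z_1), u_2) = (\nabla_{u_2}\gamma(z_1))'(R_{21}\gamma(z_1) + H_1'Q_2x) + R_{22}u_2 + H_2'Q_2x$$

$$\nabla^2_{u_2}J_2(\gamma(z_1), u_2) = R_{22} + (H_1\nabla_{u_2}\gamma(z_1) + H_2)'Q_2(H_1\nabla_{u_2}\gamma(z_1) + H_2) + \tfrac{1}{2}\left[(\nabla^2_{u_2}\gamma(z_1))(R_{21}\gamma(z_1) + H_1'Q_2x) + (R_{21}\gamma(z_1) + H_1'Q_2x)'(\nabla^2_{u_2}\gamma(z_1))'\right]$$   (3.25)

Since $Q_2$, $R_{22}$, $R_{21}$ are positive definite or positive semi-definite, we
only need

$$(\nabla^2_{u_2}\gamma(z_1))(R_{21}\gamma(z_1) + H_1'Q_2x) + (R_{21}\gamma(z_1) + H_1'Q_2x)'(\nabla^2_{u_2}\gamma(z_1))' \ge 0 \quad \forall x \text{ and } \gamma(z_1) \text{ generated by } u_2 \in U_2.$$   (3.26)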


If $\nabla^2_{u_2}\gamma(z_1) \ne 0$, the above condition is very difficult to satisfy. The
reason is that if we consider $\nabla^2_{u_2}\gamma(z_1)$ evaluated at a particular $u_2$, we only
need $(R_{21}\gamma(z_1) + H_1'Q_2x)$ to have the opposite sign of $\nabla^2_{u_2}\gamma(z_1)$ in one of its
orthogonal coordinates for the condition to fail. This immediately suggests the desirability of the
linear strategy, since $\nabla^2_{u_2}\gamma(z_1) = 0$ in that case.

(iii)  Linear  Strategies 

From the discussion in the previous sections, we see that the nonlinear
representation of the leader's strategy does not offer any advantage; in
fact, considerable care needs to be taken for convexity. Therefore, we now
restrict our attention to linear representations only.

Proposition  3.4 
Assume

(i) $\gamma(z_1)$ is a causal, differentiable, class-T function

(ii) $z_1 \supset \{x_0\}$

(iii) $\exists F \ni$

$$[(R_{22}G_2 + H_2'Q_2) + F'(R_{21}G_1 + H_1'Q_2)](D + H_1G_1^o + H_2G_2^o) = 0.$$   (3.27)

(iii)' If $z_1 = \{x\}$, $z_2 = \{x\}$, assume $\exists K \ni$

$$(K + G_1)(H_2 + H_1F) = F.$$   (3.28)

(iii)'' If $z_1 = \{x, u_2\}$, $z_2 = \{x\}$, assume $\exists K \ni$

$$(G_1 - KG_2)(H_2 + H_1F) + K = F.$$   (3.29)

Then

for $z_1 = \{u_2\}$, $z_2 = \{x\}$:   $u_1 = F(u_2 - G_2^ox_0) + G_1^ox_0$;   (3.30)

for $z_1 = \{x\}$, $z_2 = \{x\}$:   $u_1 = K(x - x^t(x_0)) + G_1x$;   (3.31)

for $z_1 = \{u_2, x\}$, $z_2 = \{x\}$:   $u_1 = K(u_2 - G_2x) + G_1x$;   (3.32)

will force $u_2$ to adopt $u_2^t$, respectively.

Proof: Substitute the expression of $u_1(\cdot)$ into Theorem 3.2; the result then
follows.

Q.E.D.


Note that the convexity condition vanishes due to the fact that
$\nabla^2_{u_2}u_1(z_1) = 0$. The conditions are easy to verify since they only involve
linear equations. The gain matrices are all causal (if they exist); therefore,
the solution is also realizable (causality is ensured $\forall G_1, G_2$ in diagonal
or noncausal representation).


3.2.3.  Examples 

We shall examine some simple scalar, 2-stage examples. Team and
RSP under information structures (a), (b) are solved using the technique
derived before. The RSP solutions are verified by substituting them back
into $J_2$ and solving for the optimal $u_2$. The effect of the weighting matrix
coefficients on the solvability and the implication of different information
structures ((a) vs. (b)) are clearly illustrated.

Consider  a  scalar,  2-stage  system 

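$$x(2) = 2x(1) + u_1(1) + u_2(1)$$

$$x(1) = x(0) + u_1(0) - u_2(0)$$

$$J_1 = 2x^2(2) + x^2(1) + x^2(0) + 2u_1^2(1) + u_1^2(0) + u_2^2(1) + u_2^2(0)$$

$$J_2 = x^2(2) + x^2(1) + 2x^2(0) + au_1^2(0) + bu_1^2(1) + cu_2^2(0) + du_2^2(1).$$

Apply static conversion:

$$x = \begin{bmatrix} x(0) \\ x(1) \\ x(2) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}x(0) + \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} u_1(0) \\ u_1(1) \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ -1 & 0 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} u_2(0) \\ u_2(1) \end{bmatrix}$$

$$J_1 = x'\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}x + u_1'\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}u_1 + u_2'u_2$$

$$J_2 = x'\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}x + u_1'\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}u_1 + u_2'\begin{bmatrix} c & 0 \\ 0 & d \end{bmatrix}u_2$$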
Team 


The  non-causal  control  laws,  from  (3.5),  are 


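Using Proposition 3.1, we transform them to the causal representation

$$u_1 = \begin{bmatrix} -\frac{3}{7} & 0 & 0 \\ 0 & -\frac{1}{2} & 0 \end{bmatrix}x, \qquad u_2 = \begin{bmatrix} \frac{3}{7} & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix}x.$$

The team trajectory is

$$x^t = \begin{bmatrix} 1 \\ \frac{1}{7} \\ \frac{1}{14} \end{bmatrix}x(0)$$

and the open loop control law is

$$u_1^t = \begin{bmatrix} -\frac{3}{7} \\ -\frac{1}{14} \end{bmatrix}x(0), \qquad u_2^t = \begin{bmatrix} \frac{3}{7} \\ -\frac{1}{7} \end{bmatrix}x(0).$$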
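The team solution of this example can be recomputed directly from the static form. The sketch below is an illustration under stated assumptions rather than the report's own procedure: it assumes the weights read off the scalar cost $J_1$ (state weights 1, 1, 2; leader control weights 1, 2; follower control weights 1, 1), stacks both controls into one vector, and solves the stationarity condition of the quadratic cost. The function name `team_solution` is hypothetical.

```python
import numpy as np

# Static conversion of x(1) = x(0) + u1(0) - u2(0), x(2) = 2x(1) + u1(1) + u2(1)
D = np.array([[1.0], [1.0], [2.0]])
H1 = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0]])
H2 = np.array([[0.0, 0.0], [-1.0, 0.0], [-2.0, 1.0]])

# Assumed weights of the leader's cost J1 (read off the scalar expression)
Q1 = np.diag([1.0, 1.0, 2.0])
R11 = np.diag([1.0, 2.0])
R12 = np.diag([1.0, 1.0])


def team_solution(x0=1.0):
    """Minimize J1 = x'Q1x + u1'R11u1 + u2'R12u2 jointly over (u1, u2)."""
    H = np.hstack([H1, H2])  # x = D x0 + H u with u = [u1; u2]
    R = np.block([[R11, np.zeros((2, 2))], [np.zeros((2, 2)), R12]])
    # Stationarity: (H'Q1H + R) u = -H'Q1 D x0
    u = np.linalg.solve(H.T @ Q1 @ H + R, -(H.T @ Q1 @ D) * x0).ravel()
    x = (D * x0).ravel() + H @ u
    return u[:2], u[2:], x
```

With x(0) = 1 this returns the open loop controls and the trajectory as plain arrays, which makes it easy to compare them with the fractions used in the rest of the example.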

RSP 


We  now  assume  linear  strategy  and  apply  Proposition  3.4. 
-  Information  structure  (a) 

Let

$$u_1 = F(u_2 - G_2^ox_0) + G_1^ox_0.$$

F satisfies

$$[F'(H_1'Q_2 + R_{21}G_1) + (H_2'Q_2 + R_{22}G_2)](D + H_1G_1^o + H_2G_2^o) = 0.$$

Restrict F to the causal structure

$$F = \begin{bmatrix} f_1 & 0 \\ f_2 & f_3 \end{bmatrix}.$$

Substituting in numerical values, we obtain

$$(4 - 6a)f_1 + (1 - b)f_2 = 4 - 6c$$

$$f_3 = \frac{2d - 1}{1 - b}.$$

We notice immediately that F is nonunique (3 variables and 2 equations);
however, given a, b, c, d, $f_3$ is unique. This points out the possibility
that, given the weighting parameters, we can tune F to achieve better
performance in, say, parameter sensitivity; or, given the desired F, we can tune
the parameters. We can also check that $N \ge \frac{2n}{m_1} - 1$ (2 > 1). Therefore, provided
the equations are all independent, we should have 1 (= 2 - 1) degree of freedom.

For some values of a, b, c, d, F may not exist at all (this points to
the importance of suboptimal strategy) even in this simple example, e.g.,
$b = 1$, $a = \frac{4}{6}$, $c \ne \frac{4}{6}$. However, we are able to say F exists generically.

To verify that the stated strategy does enforce team, we substitute in some
numerical values for (a, b, c, d) and solve for the optimal $u_2$.


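a = c = d = 1, b = 0

Set $f_2 = 0$ arbitrarily (diagonal structure); then

$$F = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

$$u_1(0) = u_2(0) - \tfrac{6}{7}x(0)$$

$$u_1(1) = u_2(1) + \tfrac{1}{14}x(0).$$

Substituting in $J_2$ and setting $\frac{\partial J_2}{\partial u_2(0)} = \frac{\partial J_2}{\partial u_2(1)} = 0$, we obtain

$$u_2(0) = \tfrac{3}{7}x(0)$$

$$u_2(1) = -\tfrac{1}{7}x(0)$$

as expected.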


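a = b = 0 ($R_{21} = 0$, i.e., $u_1$ is not penalized directly in $J_2$)

c = d = 1.

Set $f_2 = 0$ arbitrarily; then

$$F = \begin{bmatrix} -\frac{1}{2} & 0 \\ 0 & 1 \end{bmatrix}$$

$$u_1(0) = -\tfrac{1}{2}u_2(0) - \tfrac{3}{14}x(0)$$

$$u_1(1) = u_2(1) + \tfrac{1}{14}x(0).$$

Substitute in $J_2$ and carry out the minimization. We obtain

$$u_2(0) = \tfrac{3}{7}x(0)$$

$$u_2(1) = -\tfrac{1}{7}x(0)$$

as expected. In the second case, even though $u_1$ does not enter $J_2$ directly,
it does affect $J_2$ through x.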
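The two verifications above can be repeated numerically. The following sketch is illustrative only and makes its assumptions explicit: the follower's weights are taken as $Q_2 = \mathrm{diag}(2, 1, 1)$, $R_{21} = \mathrm{diag}(a, b)$, $R_{22} = \mathrm{diag}(c, d)$, the team values are the fractions quoted in this example with x(0) = 1, and the function names are hypothetical. `first_order_residual` evaluates the left-hand side of the condition on F at the team point, and `follower_best_response` minimizes $J_2$ over $u_2$ once the leader has announced $u_1 = F(u_2 - u_2^t) + u_1^t$.

```python
import numpy as np

D = np.array([1.0, 1.0, 2.0])
H1 = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0]])
H2 = np.array([[0.0, 0.0], [-1.0, 0.0], [-2.0, 1.0]])
Q2 = np.diag([2.0, 1.0, 1.0])          # assumed state weights of J2

# Team trajectory and open loop controls for x(0) = 1
X_T = np.array([1.0, 1 / 7, 1 / 14])
U1_T = np.array([-3 / 7, -1 / 14])
U2_T = np.array([3 / 7, -1 / 7])


def first_order_residual(F, a, b, c, d):
    """F'(H1'Q2 x^t + R21 u1^t) + H2'Q2 x^t + R22 u2^t; zero when F enforces team."""
    R21, R22 = np.diag([a, b]), np.diag([c, d])
    return F.T @ (H1.T @ Q2 @ X_T + R21 @ U1_T) + H2.T @ Q2 @ X_T + R22 @ U2_T


def follower_best_response(F, a, b, c, d):
    """Minimize J2 over u2 when the leader plays u1 = U1_T + F (u2 - U2_T)."""
    R21, R22 = np.diag([a, b]), np.diag([c, d])
    k = U1_T - F @ U2_T                 # u1 = k + F u2
    M = H1 @ F + H2                     # x = xc + M u2
    xc = D + H1 @ k
    lhs = M.T @ Q2 @ M + F.T @ R21 @ F + R22
    rhs = -(M.T @ Q2 @ xc + F.T @ R21 @ k)
    return np.linalg.solve(lhs, rhs)
```

Calling `follower_best_response(np.eye(2), 1, 0, 1, 1)` corresponds to the first case, and `follower_best_response(np.array([[-0.5, 0.0], [0.0, 1.0]]), 0, 0, 1, 1)` to the second. Using the first F with the second set of weights leaves a nonzero residual, and the follower's optimum then moves away from the team control.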

-  Information  structure  (b) 

Now we examine RSP when only the values of the state variables are available
to the leader. We shall see that the solvability becomes very stringent.
(For generic solvability we need $N \ge \frac{2n}{m_1} + 1$; here N = 2 and $\frac{2n}{m_1} + 1 = 3$, thus the example
here is in fact generically unsolvable.)

Consider now the representation

$$u_1 = K(x - (D + H_1G_1^o + H_2G_2^o)x(0)) + G_1x$$

where K solves

$$(K + G_1)(H_2 + H_1F) = F.$$

F is the same as in the last section. Let

$$K = \begin{bmatrix} k_1 & 0 & 0 \\ k_2 & k_3 & 0 \end{bmatrix}.$$

Then

$k_1$ = arbitrary, $f_1 = f_3 = 0$

$k_2$ = arbitrary

$k_3 = \frac{1}{2} - f_2$.

$k_1$, $k_2$ do not enter into the solution: since $u_1$ cannot deduce any information about
$u_2$ from x(0), the dependence on x(0) has no consequence for the solution.

The same reasoning tells us that a penalty on $u_1(0)$ will also have no effect
on the solution.

Note that given (a, b, c, d), $f_3$ is determined uniquely. Therefore,
$f_3 = 0$ is a strict requirement on the problem (if $d \ne \frac{1}{2}$, the problem has no
solution). This coincides with the statement before that, since $N < \frac{2n}{m_1} + 1$, the
problem is generically unsolvable. Here we proceed with the assumption $d = \frac{1}{2}$
in order to verify that the strategy does indeed enforce the team solution.


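- Let c = 1, b = 0; then $f_2 = -2$,

$$K = \begin{bmatrix} k_1 & 0 & 0 \\ k_2 & \frac{5}{2} & 0 \end{bmatrix}$$

$$u_1(1) = \tfrac{11}{14}x(0) - 2u_2(0)$$

$$u_2(0) = \tfrac{3}{7}x(0)$$

$$u_2(1) = -\tfrac{1}{7}x(0).$$

- Let c = 1/2, b = 0; then $f_2 = 1$,

$$K = \begin{bmatrix} k_1 & 0 & 0 \\ k_2 & -\frac{1}{2} & 0 \end{bmatrix}$$

$$u_1(1) = -\tfrac{1}{2}x(0) + u_2(0)$$

$$u_2(0) = \tfrac{3}{7}x(0)$$

$$u_2(1) = -\tfrac{1}{7}x(0).$$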
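The relation between K and F in the state-information case can also be checked numerically. This is a hedged sketch, not taken from the report: it assumes the leader's team feedback gain for this example is $u_1(0) = -\frac{3}{7}x(0)$, $u_1(1) = -\frac{1}{2}x(1)$ (the values that reproduce the team controls quoted above), and the helper names are hypothetical. The residual of $(K + G_1)(H_2 + H_1F) = F$ vanishes for the two cases listed, whatever $k_1$ and $k_2$ are, and cannot vanish when $f_3 \ne 0$.

```python
import numpy as np

H1 = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0]])
H2 = np.array([[0.0, 0.0], [-1.0, 0.0], [-2.0, 1.0]])

# Assumed causal team gain of the leader: u1(0) = -3/7 x(0), u1(1) = -1/2 x(1)
G1 = np.array([[-3 / 7, 0.0, 0.0], [0.0, -0.5, 0.0]])


def causal_K(k1, k2, k3):
    """Causal state-feedback threat gain: u1(k) may use x(0..k) only."""
    return np.array([[k1, 0.0, 0.0], [k2, k3, 0.0]])


def state_feedback_residual(K, F):
    """Residual of (K + G1)(H2 + H1 F) = F; zero when K realizes F."""
    return (K + G1) @ (H2 + H1 @ F) - F
```

The entries $k_1$, $k_2$ multiply the zero first row of $H_2 + H_1F$, which is the algebraic reason they drop out of the solution.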


3.3.  Behavior  of  Leader’s  Cost  Under  Large  Threat 

A natural question to pose after obtaining the results of the
previous section is what can be done when there exists no solution to the
set of conditions stated. It will be seen in this section that under certain
mild conditions, infinite threat from the leader (i.e., the leader threatens
to drive the follower's cost to infinity if the follower does not perform
as desired) can achieve the team solution for the leader. It therefore seems
promising that perhaps near-team cost can be attained by using a very large
but finite threat when the previously stated conditions are not satisfied.
However, this expectation will be shown to be false in general if the leader's
representation is continuous in $u_2$. No matter how heavily (but finitely) the
leader penalizes the follower's deviation, he cannot achieve arbitrary
closeness to his team cost.

3.3.1.  Solvability  of  RSP  Under  Infinite  Threat 

We  shall  be  concerned  with  linear  representation  of  the  leader's 
strategy  only.  We  study  the  solvability  of  RSP  when  the  threat  in  the  leader's 
strategy  is  weighted  by  a  gain  that  tends  to  infinity.  It  is  shown  that  under 
some  mild  conditions  RSP  is  solved. 

Without  loss  of  generality  (in  the  class  of  deterministic, 
centralized  information  structures) ,  we  assume  information  structure  (a) . 

Assume we adopt the representation (3.30) for the leader's strategy

$$u_1(u_2) = F(u_2 - G_2^ox_0) + G_1^ox_0$$

and assume the optimal strategy of the follower, given that the leader has
announced his strategy, is

$$u_2^* = G_2^ox_0 + \Delta Gx_0,$$   (3.33)

where $(G_1^ox_0, G_2^ox_0)$ is the team solution pair. From (3.18),

$$P_t[F'(H_1'Q_2x + R_{21}u_1) + H_2'Q_2x + R_{22}u_2] = 0$$   (3.34)

$$u_1(u_2^*) = F\Delta Gx_0 + G_1^ox_0$$   (3.35)

$$x = Dx_0 + H_1u_1(u_2^*) + H_2u_2^*$$   (3.36)

$$\quad = Dx_0 + H_1G_1^ox_0 + H_2G_2^ox_0 + H_1F\Delta Gx_0 + H_2\Delta Gx_0$$

$$\quad = (D + H_1G_1^o + H_2G_2^o)x_0 + (H_1F + H_2)\Delta Gx_0$$

$$\quad = x^t + \Delta x.$$   (3.37)

Note that

$$G_i^ox_0 = G_ix^t \quad \text{(by the definition of } G_i\text{)}.$$   (3.38)

Rewrite (3.33), (3.35) using (3.38) and substitute together with (3.37) into
(3.34):

$$P_t[(F'(H_1'Q_2 + R_{21}G_1) + H_2'Q_2 + R_{22}G_2)x^t + F'(H_1'Q_2\Delta x + R_{21}F\Delta Gx_0) + H_2'Q_2\Delta x + R_{22}\Delta Gx_0] = 0$$

$$\{[(F'(H_1'Q_2 + R_{21}G_1) + H_2'Q_2 + R_{22}G_2)(D + H_1G_1^o + H_2G_2^o)] + [F'(H_1'Q_2(H_1F + H_2) + R_{21}F) + H_2'Q_2(H_1F + H_2) + R_{22}]\Delta G\}x_0 = 0.$$

Since $x_0$ can be any vector in $R^n$, we have

$$[(F'(H_1'Q_2 + R_{21}G_1) + H_2'Q_2 + R_{22}G_2)(D + H_1G_1^o + H_2G_2^o)] + [R_{22} + H_2'Q_2H_2 + F'H_1'Q_2H_1F + F'R_{21}F + F'H_1'Q_2H_2 + H_2'Q_2H_1F]\Delta G = 0$$   (3.39)

$$\Delta G = -[(R_{22} + H_2'Q_2H_2) + F'(H_1'Q_2H_1 + R_{21})F + F'H_1'Q_2H_2 + H_2'Q_2H_1F]^{-1}[(F'(H_1'Q_2 + R_{21}G_1) + H_2'Q_2 + R_{22}G_2)(D + H_1G_1^o + H_2G_2^o)].$$   (3.40)


If F satisfies (3.27), then $\Delta G = 0$, and the leader's team solution is enforced.
If there exists no F satisfying (3.27), the team solution is still attainable
by the following.

Proposition 3.5
If $F \cdot 0 = 0\ \forall F$, $F \to \infty$, and $(H_1'Q_2H_1 + R_{21})$
is nonsingular, then the representation of the leader's strategy as in (3.30)
will force the follower to adopt the corresponding team strategy.

Proof: Let $\|F\| \to \infty$; then along any direction in the $R^{m_1 \times m_2}$ space the denominator
is of $O(\|F\|^2)$ and the numerator is of $O(\|F\|)$. Therefore, $\Delta G \to 0$
componentwise; $\Delta G = 0$ implies the leader's team solution is enforced. Q.E.D.

The above result is theoretically useful since it says that RSP
is always solvable for this information structure provided that infinite gain
is possible. However, the infinite threat is not physically realizable;
therefore, it is natural to ask whether the team cost can be approached
arbitrarily closely given a finite gain that is large enough.

3.3.2.  Effect  of  Finiteness  of  Threat 

It is shown in this section that if we consider F not identically
equal to infinity, the largeness of F will not enable the leader to approach
team cost arbitrarily. A key assumption in Proposition 3.5 is that $F \cdot 0 = 0$,
which means that even though F is an infinite threat, if the follower plays
team exactly, the threat will have no effect. However, for the case of F being
finite (no matter how large), the follower's decision cannot be made exactly
team (the first order condition in Section 3.2 is assumed not satisfied). The
deviation can be shown to be $\sim O(\|F\|^{-1})$, which is then amplified by F. Therefore,
there will be a sizable deviation in the leader's cost.


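We first examine the effect of control offsets on the leader's
cost. Let

$$u_1 = G_1^ox_0 + F\Delta Gx_0$$

$$u_2 = G_2^ox_0 + \Delta Gx_0.$$

Then

$$x = Dx_0 + H_1G_1^ox_0 + H_2G_2^ox_0 + H_1F\Delta Gx_0 + H_2\Delta Gx_0 = x^t + (H_1F + H_2)\Delta Gx_0$$

$$J_1(u_1, u_2) = (x^t + \Delta x)'Q_1(x^t + \Delta x) + (u_1^t + \Delta u_1)'R_{11}(u_1^t + \Delta u_1) + (u_2^t + \Delta u_2)'R_{12}(u_2^t + \Delta u_2)$$

$$\Delta J_1 = x_0'\{\Delta G'[(H_1F + H_2)'Q_1(D + H_1G_1^o + H_2G_2^o) + F'R_{11}G_1^o + R_{12}G_2^o] + [(D + H_1G_1^o + H_2G_2^o)'Q_1(H_1F + H_2) + G_1^{o\prime}R_{11}F + G_2^{o\prime}R_{12}]\Delta G + \Delta G'[(H_1F + H_2)'Q_1(H_1F + H_2) + F'R_{11}F + R_{12}]\Delta G\}x_0.$$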

If the leader's cost is continuous with respect to $\|F\|$, letting $F \to \infty$ in
(3.43) should imply $\Delta J_1 \to 0$. However, we will see that in general this is
not true by deriving $\lim_{F \to \infty}\Delta J_1$. As $F \to \infty$, we retain the dominant terms
only:

$$\Delta G \sim -(F'(H_1'Q_2H_1 + R_{21})F)^{-1}F'(H_1'Q_2 + R_{21}G_1)$$

$$\Delta J_1 \sim \Delta G'F'\big((H_1'Q_1H_1 + R_{11})G_1^o + H_1'Q_1(D + H_2G_2^o) - (H_1'Q_1H_1 + R_{11})F(F'(H_1'Q_2H_1 + R_{21})F)^{-1}F'(H_1'Q_2 + R_{21}G_1)\big)$$

where ker F = 0 has been assumed (generically true if $m_1 \ge m_2$).


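Since ker F = 0, there exist orthogonal matrices U, V such that

$$F = U\begin{bmatrix} \bar F \\ 0 \end{bmatrix}V, \qquad \begin{bmatrix} \bar F \\ 0 \end{bmatrix} = U'FV'$$

where $\bar F$ is nonsingular. Then, with $R = H_1'Q_2H_1 + R_{21} > 0$,

$$F(F'RF)^{-1}F' = F\left(V'\begin{bmatrix} \bar F' & 0 \end{bmatrix}U'RU\begin{bmatrix} \bar F \\ 0 \end{bmatrix}V\right)^{-1}F'.$$

Let

$$U'RU = \begin{bmatrix} R_1 & R_2 \\ R_2' & R_3 \end{bmatrix}.$$

Then

$$F(F'RF)^{-1}F' = FV'(\bar F'R_1\bar F)^{-1}VF' = U(U'FV')\bar F^{-1}R_1^{-1}\bar F'^{-1}(VF'U)U' = U\begin{bmatrix} I \\ 0 \end{bmatrix}R_1^{-1}\begin{bmatrix} I & 0 \end{bmatrix}U' = U\begin{bmatrix} R_1^{-1} & 0 \\ 0 & 0 \end{bmatrix}U'$$

$$\Delta G'F' = -(H_1'Q_2 + R_{21}G_1)'F(F'(H_1'Q_2H_1 + R_{21})F)^{-1}F'$$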

As $F \to \infty$,

$$\Delta G'F' \to -(H_1'Q_2 + R_{21}G_1)'U\begin{bmatrix} R_1^{-1} & 0 \\ 0 & 0 \end{bmatrix}U'$$

as derived before, and

$$\Delta J_1 \to -(H_1'Q_2 + R_{21}G_1)'U\begin{bmatrix} R_1^{-1} & 0 \\ 0 & 0 \end{bmatrix}U'\Big[(H_1'Q_1H_1 + R_{11})G_1^o + H_1'Q_1(D + H_2G_2^o) - (H_1'Q_1H_1 + R_{11})U\begin{bmatrix} R_1^{-1} & 0 \\ 0 & 0 \end{bmatrix}U'(H_1'Q_2 + R_{21}G_1)\Big]$$

which in general is nonzero, thus proving the asserted result.

It is certainly of importance to investigate, in the case of finite
threat and failure of the first order condition, whether there exists a near
optimal strategy for the leader. One method is to assume one representation
(selected from the linear class) and perform parameter optimization. A
possible conjecture is that the minimum solution F in (3.27) will correspond
to the nearest optimal solution. However, a verification of this conjecture
is not yet available.

Note that from (3.50), $\Delta J_1$ will in fact tend to zero on a variety
of the parameter space. Since we are only interested in this result when
conditions like (3.27) fail, in some cases it may happen that this variety
will have high probability of occurrence on the subset of the parameter space
where (3.27) fails. However, it appears "generically" that $\Delta J_1$ tends to a
nonzero limit for Stackelberg strategy with very large threat.

It should be noted also that the conclusion drawn here is for $u_1$
as a continuous function of $u_2$. If $u_1$ is allowed to be discontinuous, $\Delta J_1$
will in fact be zero for finite threats that are large enough.
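The qualitative claim of this section can be illustrated numerically on a scalar 2-stage system. The sketch below is an experiment under stated assumptions, not a reproduction of the report's closed-form limits: the dynamics are x(1) = x(0) + u1(0) - u2(0), x(2) = 2x(1) + u1(1) + u2(1); the weights are $Q_1 = \mathrm{diag}(1, 1, 2)$, $R_{11} = \mathrm{diag}(1, 2)$, $R_{12} = I$, $Q_2 = \mathrm{diag}(2, 1, 1)$, $R_{21} = \mathrm{diag}(a, b)$, $R_{22} = \mathrm{diag}(c, d)$; the threat matrix is taken as f times a fixed lower-triangular pattern so that all coefficients grow at equal rate; and the function names are hypothetical. With a = b = c = d = 1 no finite F satisfies the first order condition (b = 1 with d ≠ 1/2), so the follower's deviation shrinks like 1/f while the leader's cost deviation settles at a nonzero value.

```python
import numpy as np

D = np.array([1.0, 1.0, 2.0])
H1 = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 1.0]])
H2 = np.array([[0.0, 0.0], [-1.0, 0.0], [-2.0, 1.0]])
Q1, R11, R12 = np.diag([1.0, 1.0, 2.0]), np.diag([1.0, 2.0]), np.eye(2)
Q2 = np.diag([2.0, 1.0, 1.0])

# Team open loop controls for x(0) = 1 under the assumed leader weights
U1_T = np.array([-3 / 7, -1 / 14])
U2_T = np.array([3 / 7, -1 / 7])


def leader_cost(u1, u2):
    """J1 = x'Q1x + u1'R11u1 + u2'R12u2 with x(0) = 1."""
    x = D + H1 @ u1 + H2 @ u2
    return x @ Q1 @ x + u1 @ R11 @ u1 + u2 @ R12 @ u2


def large_threat_deviation(f, a, b, c, d):
    """Follower's deviation from team and the leader's cost deviation for threat gain f."""
    F = f * np.array([[1.0, 0.0], [1.0, 1.0]])   # assumed equal-rate causal threat
    R21, R22 = np.diag([a, b]), np.diag([c, d])
    k = U1_T - F @ U2_T                          # announced strategy: u1 = k + F u2
    M = H1 @ F + H2
    xc = D + H1 @ k
    # Follower minimizes J2 = x'Q2x + u1'R21u1 + u2'R22u2 over u2
    u2 = np.linalg.solve(M.T @ Q2 @ M + F.T @ R21 @ F + R22,
                         -(M.T @ Q2 @ xc + F.T @ R21 @ k))
    dG = u2 - U2_T
    u1 = U1_T + F @ dG
    return dG, leader_cost(u1, u2) - leader_cost(U1_T, U2_T)
```

In this setting the product of f and the follower's deviation approaches (1/28, -1/14), and the cost deviation approaches 6/784, roughly 0.0077, instead of zero. These numbers come from this sketch and its assumed threat pattern; they are not quoted from the report.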

3.3.3.  Examples 

We  use  the  example  in  Section  3.2.3  to  illustrate  the  effect  of 
infinite  and  finite  threats.  It  is  shown  that  if  each  component  of  the 
threat  tends  to  infinity  at  equal  rate  when  the  leader  announces  his  strategy, 
then  the  team  solution  is  indeed  enforced.  However,  if  the  threat 
coefficients  tend  to  infinity  (at  equal  rate)  in  the  leader's  cost,  the 
limiting  cost  is  shown  to  be  higher  than  the  team  cost. 

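From Section 3.2.3, we have the following representation for $u_1(0)$
and $u_1(1)$:

$$u_1(0) = f_1(u_2(0) - \tfrac{3}{7}x_0) - \tfrac{3}{7}x_0$$

$$u_1(1) = f_2(u_2(0) - \tfrac{3}{7}x_0) + f_3(u_2(1) + \tfrac{1}{7}x_0) - \tfrac{1}{14}x_0$$

where $f_1$, $f_2$, $f_3$ are coefficients in the threat matrix. Then,

$$x(1) = \tfrac{4}{7}x_0 + f_1(u_2(0) - \tfrac{3}{7}x_0) - u_2(0)$$

$$x(2) = \tfrac{15}{14}x_0 + 2f_1(u_2(0) - \tfrac{3}{7}x_0) + f_2(u_2(0) - \tfrac{3}{7}x_0) + f_3(u_2(1) + \tfrac{1}{7}x_0) - 2u_2(0) + u_2(1).$$

Let

$$u_2(0) = \tfrac{3}{7}x_0 + \Delta g_0x_0$$

$$u_2(1) = -\tfrac{1}{7}x_0 + \Delta g_1x_0.$$

Then

$$u_1(0) = (f_1\Delta g_0 - \tfrac{3}{7})x_0$$

$$u_1(1) = (f_2\Delta g_0 + f_3\Delta g_1 - \tfrac{1}{14})x_0.$$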


When $f_1$, $f_2$, $f_3$ tend to $+\infty$ at equal rate, asymptotically

2(l+b)f „f „-(l+3b-3a-3ab)f,  f 


14[ (1+b) f 3 ( (5+a) f “+(l+b) f “+4f xf 2)-f 3(2f yKl+b) f 2) *] 

-2( (1+b) f „+2f , ) (1+b) f„-2(2-3a) f , )+(b-l) ( (5+a) f?+(l+b) f ?+4f , f 


14 [ (1+b) f 3 ( (5+a) f “+(l+b) f “+4f if J -f 3 (2f x+(l+b) f 2) 


Thus, when $f_1 = f_2 = f_3 = +\infty$,

$$\Delta g_0 = \Delta g_1 = 0$$

and the team solution is enforced by the leader.


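If we retain $f_1$, $f_2$, $f_3$ and substitute the expressions into $J_1$,

$$J_1 = J_1^t + \Delta J_1$$

where $J_1^t$ is the team cost and $\Delta J_1$ is the deviation due to $\Delta g_0$ and $\Delta g_1$.
As $f_i \to \infty$ at equal rate,

$$\Delta J_1 \sim (10f_1^2 + 8f_1f_2 + 4f_2^2)\Delta g_0^2 + 4f_3^2\Delta g_1^2 + (8f_1f_3 + 8f_2f_3)\Delta g_0\Delta g_1.$$

From the above, we know $\Delta g_0, \Delta g_1 \sim O(1/f_i)$. But the quadratic-$f_i$ coefficients
make each term tend to a finite limit.

The equations determining $\Delta g_0$ and $\Delta g_1$ are obtained as follows. In terms of
$\Delta g_0$ and $\Delta g_1$, the state is

$$x(1) = (\tfrac{1}{7} + (f_1 - 1)\Delta g_0)x_0$$

$$x(2) = (\tfrac{1}{14} - 2\Delta g_0 + \Delta g_1 + 2f_1\Delta g_0 + f_2\Delta g_0 + f_3\Delta g_1)x_0.$$

Substitution of these expressions into $J_2$ and minimization with respect to
$u_2(0)$ and $u_2(1)$ render

$$\frac{(6c - 4) + (4 - 6a)f_1 + (1 - b)f_2}{14} + ((5 + a)f_1^2 + (1 + b)f_2^2 + 4f_1f_2 - 10f_1 - 4f_2 + 5 + c)\Delta g_0 + (2f_1f_3 + (1 + b)f_2f_3 - 2f_3 + 2f_1 + f_2 - 2)\Delta g_1 = 0$$

$$\frac{(1 - b)f_3 + (1 - 2d)}{14} + ((1 + b)f_2f_3 + 2f_1f_3 + 2f_1 + f_2 - 2f_3 - 2)\Delta g_0 + ((1 + b)f_3^2 + 2f_3 + (1 + d))\Delta g_1 = 0.$$

When $f_1$, $f_2$, $f_3$ are chosen to satisfy the sufficiency conditions derived in
Section 3.2, namely,

$$(6c - 4) + (4 - 6a)f_1 + (1 - b)f_2 = 0$$

$$(1 - b)f_3 + (1 - 2d) = 0,$$

then $\Delta g_0 = \Delta g_1 = 0$ and the team solution is enforced.

If $\lim_{f_i \to \infty} f_i/f_j = 1 \quad \forall i, j \in \{1, 2, 3\}$, then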


lim  AJ  -  - - - x  [22(4b+6a+6ab)2  +  4(8-b2-37a-3b+ab)2 

f^-**  ( l+a+5b+ab) 1 

+ 15 (8-b2-37a-3b+ab) (4b+ba+6ab) ] . 

Thus,  we  conclude  that  the  team  cost  cannot  be  approached  arbitrarily  close 
with  large  threats. 

It has been mentioned in the previous section that we should
examine the continuity of $\Delta J_1$ only for the cases in which the sufficiency conditions
in Section 3.2 fail. In this example, they occur at $a = \frac{2}{3}$ or $b = 1$. For $b = 1$,
AJ  - - [352(l+3a)2  +  64(l-9a)2  +  256(l-9a)(l+3a)]. 
3.4.  Decentralized and Stochastic RSP
3.4.1.  Introduction 

The  cases  we  shall  examine  here  are  the  deterministic  decentralized 
and  stochastic  state  feedback  information  structures.  Due  to  the  difficulty 
of  the  general  decentralized  team  problem  (as  will  be  explained  later),  the 
team  problem  is  solved  under  the  restriction  of  linear  strategies.  Sufficient 
conditions  similar  to  those  obtained  in  section  3.2  can  then  be  stated,  but, 
as  to  be  expected,  they  become  slightly  more  stringent.  In  the  stochastic 
case,  the  problem  is  not  solved  in  general;  it  is  only  after  some  further 
restrictions  are  imposed  on  the  information  structure  that  non-void  conditions 
for  RSP  can  be  obtained. 

Intuitively,  RSP  can  be  solved  in  two  ways.  One  is  to  use  the 
infinite  threat  concept  discussed  in  section  3.3.  The  other  is  to  alter  the 
follower's  objective  function  so  that  the  optimal  follower  strategy  coincides 
with  the  team  strategy.  The  former  method  meets  with  difficulties  in  both 
decentralized  and  noisy  state  information  cases.  In  the  first  case,  the  leader 
can  only  enforce  the  team  trajectory  projected  onto  his  observation  space, 
which  in  general  does  not  imply  that  his  team  cost  is  attained.  In  the  second 
case, the leader is unable to implement the threat term (that vanishes upon the
enforcement of the team solution) due to the random nature of the state trajectory.
The latter method, however, can be applied to the deterministic decentralized
RSP provided the strategies are constrained to linear representations.
But the method still fails in the stochastic case. Therefore, we allow the
leader to have access to the follower's past control; the problem then reduces
to the same framework as before.

3.4.2.  Decentralized RSP

We approach RSP under information structure (d) in the same manner
as in section 3.2. However, now the team solution is not as easily obtainable
as in the previous case. It is known that the team solution under the
decentralized information pattern is in general non-linear, and, in fact, the
problem is not always analytically solvable ([15], [16]). The difficulty lies
in the dual role of the control variables, namely, control and estimation.
Specifically, if the projection approach used in section 3.2 is used here, the
projections of the state variables onto the observation space cannot be evaluated,
since the distribution of the states is affected by the past controls,
which in turn depend on the projection of the states. Therefore, here we constrain
the strategies to be linear in the observations. The optimization of the
leader's performance is carried out under this constraint by using the parameter
optimization technique (the discrete-time and finite-horizon equivalent
of the continuous-time approach in [17]). The leader then tries to enforce
this solution. Once linearity is assumed, sufficient conditions for RSP solutions
can be derived in the same way as in section 3.2, but, as expected, these
conditions are more stringent than their centralized, deterministic counterparts.
Team  Solution  Under  Linear  Representation  Constraint 

In this section, we use the parameter optimization technique to obtain
the best linear solution for the leader's team problem. The decentralized outputs
are assumed to be linear functions of the past states, i.e.,

$$y_i(k) = \sum_{j=0}^{k} C_{ij}(k)\,x(j)$$   (3.51)

is the observation vector for player i at stage k. The states may be
noise-corrupted.

Convert the observations to static form in the usual way:

$$y_i = C_ix$$   (3.52)

where $C_i$ is a block lower triangular matrix.

The objective is to solve

$$\min J_1 = E[x'Q_1x + u_1'R_{11}u_1 + u_2'R_{12}u_2]$$   (3.53)

such that

$$u_i = G_iy_i, \quad i = 1, 2$$

where $G_i$ is a block lower triangular matrix.

The  following  proposition  states  the  sufficient  conditions  for  the  above 
problem. 

Proposition  3.6 


If  there  exists  a  unique  quadruplet  (G^,  G2>  P>  ^)  satisfying 
(I-Hjl  Gx  Cl  -  HjG^)  P  (I-HjGjCj^  -  -  Dl^' 


(Q1  +  Cl  G1  R11G1C1  +  C2  G2  R12G2C2) 


(3.54) 


+  (I-HjGjCj^  -  HjG^)'  A  (i-h^Cj.  -  HjG^)  -  0  (3.55) 


utL  l  CLP  (C1G1/R11  -  (I-HjGjC,.  -  H^C^'A  Hl>  *  "  0 


(3.56) 


0 


(3 


ut2  l  C2P  (C2  G2  R12  -  (I-H1G1C1  -  A  H2)  } 


(ut^  (•)  A  block  upper  triangular  portion  of  the  matrix  with  block 
dimension  x  )  (3 


»12  +  «2  A  **2 


A  IL.  >  0 


Ru  +  Ht  Ahi>0 


P  >  0 

(where  £0  -  E  [XqXo '] ) 


G1  Y1 


U„  *  G„  Y„ 
2  2  2 


solve  the  problem  (3.53) 


Proof: See Appendix II.

Q.E.D.

Conditions for Enforcing the Team Solution

We now derive the sufficient conditions for the leader to enforce
his best linear decentralized team solution. The development is similar to
the centralized case; in fact, some of the previous results are directly
applicable here.


Theorem  3.7 


Assume

(i) $\gamma(z_1)$ is a causal, differentiable, class-T function

(ii) $J_2(\gamma(z_1), u_2)$ is convex over the convex set $U_2 = \{u_2 \mid u_2(k)$ measurable
with respect to $\{y_2(0), \ldots, y_2(k)\}\}$

(iii) $\exists F \ni$ F solves

$$[(R_{22}G_2C_2 + H_2'Q_2) + F'(R_{21}G_1C_1 + H_1'Q_2)](I - H_1G_1C_1 - H_2G_2C_2)^{-1}D = 0$$

and

$$F = \nabla_{y_1}\gamma(y_1)\big|_{y_1 = y_1^t}\,C_1(H_1F + H_2);$$

then $u_1 = \gamma(z_1)$ will force $u_2$ to play $u_2^t$.

Proof:

Assumptions (i), (ii) guarantee that if the minimization of $J_2(\gamma(z_1), u_2)$
results in $u_2 = u_2^t$, then $(u_1^t, u_2^t)$ is a global minimum of $J_2(\gamma(z_1), u_2)$.
Therefore, it suffices to show that $P_t[\nabla_{u_2}J_2]\big|_{u_2 = u_2^t} = 0$ is implied by condition (iii).

$$P_t[\nabla_{u_2}J_2] = P_t[(\nabla_{u_2}x)'Q_2x + (\nabla_{u_2}\gamma)'R_{21}\gamma + R_{22}u_2]$$   (3.65)

$$\nabla_{u_2}x = H_2 + H_1\nabla_{u_2}\gamma$$   (3.66)

$$\nabla_{u_2}\gamma = \nabla_{y_1}\gamma(y_1)\,\nabla_{u_2}y_1 = \nabla_{y_1}\gamma(y_1)\,C_1(H_2 + H_1\nabla_{u_2}\gamma)$$   (3.67)

(3.65) becomes

$$P_t[(\nabla_{u_2}\gamma(y_1))'(H_1'Q_2x + R_{21}\gamma(y_1)) + H_2'Q_2x + R_{22}u_2] = 0$$   (3.68)

When $u_2 = G_2y_2 = G_2C_2x$ and $\gamma(y_1) = G_1C_1x$, (3.68) becomes

$$P_t\big[\big((\nabla_{u_2}\gamma(y_1))'\big|_{y_1 = y_1^t}(H_1'Q_2 + R_{21}G_1C_1) + H_2'Q_2 + R_{22}G_2C_2\big)x^t\big] = 0$$

$$x^t = (I - H_1G_1C_1 - H_2G_2C_2)^{-1}Dx_0.$$

It is sufficient that

$$\big[(\nabla_{u_2}\gamma(y_1))'\big|_{y_1 = y_1^t}(H_1'Q_2 + R_{21}G_1C_1) + H_2'Q_2 + R_{22}G_2C_2\big](I - H_1G_1C_1 - H_2G_2C_2)^{-1}D = 0.$$   (3.69)

Let $\nabla_{u_2}\gamma(y_1)\big|_{y_1 = y_1^t} = F$.

Substituting in (3.67) and (3.69), the result follows.

It is seen that the result is almost identical to that of the full state
information case. Therefore, all the qualitative discussion pertaining to
that case carries over here. The specialization to linear strategy is
straightforward and is therefore omitted here.

3.4.3.  Stochastic RSP

As mentioned in the introduction, stochastic RSP with state information
only is not solvable due to the randomness of the team trajectory.

The only information structure that can be shown solvable under the stochastic
setting is the one including perfect knowledge of the follower's action. This
assumption exactly bypasses the difficulty, since $u_2$ is known and can be used
to check against $u_2^t$. Once this assumption is made, the derivation becomes
almost identical to that in section 3.2.

For the LQ setting given state information, the separation theorem
holds; therefore, the team optimal solutions are as in (3.9),

$$u_i = G_ix$$

where $G_i$ is block diagonal.


We can then immediately state the sufficient conditions.

Theorem 3.8

Given information structure (e), assume

(i) $\gamma(z_1)$ is a causal, differentiable, class-T function

(ii) $J_2(\gamma(z_1), u_2)$ is convex over the convex set $U_2 = \{u_2 \mid u_2(k)$ measurable
with respect to $\{x(0), \ldots, x(k)\}\}$

(iii) $\exists F \ni$

$$[F'(H_1'Q_2 + R_{21}G_1) + (H_2'Q_2 + R_{22}G_2)](I - H_1G_1 - H_2G_2)^{-1}[D \;\; I] = 0$$   (3.70)

and

$$\nabla_x\gamma(x, u_2)\big|_t\,(H_2 + H_1F) + \nabla^{(2)}_{u_2}\gamma(x, u_2)\big|_t = F$$   (3.71)

where $|_t$ denotes evaluation at $x = x^t$, $u_2 = u_2^t$.

Then $u_1 = \gamma(z_1)$ will force $u_2$ to adopt $u_2^t$.

Proof:

Using the proof of Theorem 3.2, we get

$$[(\nabla_{u_2}\gamma(x, u_2)\big|_t)'(H_1'Q_2 + R_{21}G_1) + (H_2'Q_2 + R_{22}G_2)]x^t = 0$$

$$x^t = Dx_0 + H_1G_1x^t + H_2G_2x^t + w$$

$$x^t = (I - H_1G_1 - H_2G_2)^{-1}[D \;\; I]\begin{bmatrix} x_0 \\ w \end{bmatrix}.$$

It is sufficient that

$$[(\nabla_{u_2}\gamma(x, u_2)\big|_t)'(H_1'Q_2 + R_{21}G_1) + (H_2'Q_2 + R_{22}G_2)](I - H_1G_1 - H_2G_2)^{-1}[D \;\; I] = 0.$$   (3.72)

$\nabla_{u_2}\gamma(x, u_2)$ satisfies (from (3.21))

$$\nabla_x\gamma(x, u_2)\big|_t\,(H_2 + H_1\nabla_{u_2}\gamma(x, u_2)\big|_t) + \nabla^{(2)}_{u_2}\gamma(x, u_2)\big|_t = \nabla_{u_2}\gamma(x, u_2)\big|_t.$$

Let $F = \nabla_{u_2}\gamma(x, u_2)\big|_t$; we have the stated result.

In (3.70), we have $\frac{N(N+1)}{2}(m_1 \times m_2)$ unknowns in F (F is causal),
and $N(2n \times m_2)$ equations. Then if $N \ge \frac{4n}{m_1} - 1$, F is generically solvable.

The stochastic RSP is still an open problem. Although we know
that under state information alone a solution does not exist, it will
be of great interest to see how near-optimal the parameter
optimization approach is. The near-optimality of some intuitive methods, such
as the use of the best team state trajectory estimate or just plain certainty
equivalence, should also be investigated.




4.  CONCLUSION 


In this report, we have studied the application of the dynamic-to-static
conversion technique to the Restricted Stackelberg Problem. RSP is a
restricted version of the Stackelberg equilibrium solution concept. It is
an important modeling tool for economic systems and large scale engineering
systems.

The  definitions  of  the  team,  Stackelberg,  and  restricted  Stackelberg 
problems  are  first  stated.  We  then  give  the  precise  statement  of  the  problem 
under  consideration  and  introduce  the  conversion  technique  which  is  the 
backbone of this analysis. The past work and results are briefly summarized
and the contributions of this report are pointed out to close off Chapter 2.

The main results and discussions are presented in Chapter 3. RSP
under five different information structures is considered. Three of the
information structures are deterministic and centralized; the others are
deterministic decentralized and stochastic. The deterministic centralized
information patterns illustrate how RSP is approached and permit the
examination of various qualitative aspects of its solution. They also show
how the restriction on the information structure affects the solvability of
RSP. The decentralized information pattern encounters a particular difficulty
with regard to RSP, viz., in the solution of the corresponding team problem.
Since the team solution is difficult to obtain in general, we settle for
a suboptimal result, the best linear team solution. Sufficient conditions
for the leader to enforce this solution are then derived. The stochastic
information patterns create another difficulty in RSP: the inability to
formulate the threat in the presence of random noise. This problem is
bypassed by assuming the follower's past controls to be available to the
leader.

The centralized cases are studied in detail. Sufficient conditions
for RSP solutions are derived, and the way they are affected by the
information structure of the leader is discussed. We then examine the
stationarity and convexity conditions for the general nonlinear
representation of the leader's strategy to ensure that the team solution is
indeed also the follower's optimal operating point. The result restricted to
linear representations of the leader's strategy is then presented, motivated
by the observation that nonlinearity does not add any significant advantage
and poses difficulty in the convexity condition. An example is also presented
to verify the derived results.

Since the sufficient conditions are not always satisfied, a natural
question arises: when the stated conditions fail to be satisfied, can the
leader attain a cost arbitrarily close to the team cost by choosing a threat
as large as he desires (but finite)? In addressing this question, it is
found that if the threat is infinite, RSP is solved (under some mild
conditions). However, if the threat is large but finite (no matter how large),
in general there is always an offset, bounded away from zero, between his
cost and the team cost. It should be noted that this assertion is not true if
discontinuous strategies are allowed. Although this result reduces the hope
of a continuous, guaranteed near-optimal solution, it does offer a design
alternative if the offset is not very large. An example is also presented
to verify the above result.


Strictly speaking, the general decentralized RSP is not solved. The
problem lies in the fact that the corresponding team problem is not solved
in general. However, if the structure of the strategies is restricted to be
linear, then the team problem can be solved by parameter optimization. Once
the linearity assumption is adopted, RSP is solved in the same way as in the
centralized case. The stochastic case with state information is not solved
(and seems unsolvable in its full generality) due to the lack of redundant
information to implement the threat (the state trajectory corresponds to a
sample path of a random process). To bypass the problem, we allow the leader
to have access to the follower's past controls. The problem then reduces to
the deterministic case. The stochastic decentralized RSP with the leader
having the follower's past controls, though not presented, can be tackled by
combining the approaches to the above two problems. However, the linear
representation constraint again has to be used. Suboptimal results may be
obtained via parameter optimization, but are not pursued in this report.

The static conversion has proved invaluable in simplifying the
conditions and the analysis of RSP. A great many open questions certainly
remain, even for this special type of problem. The suboptimal strategies
need to be investigated in the deterministic case when the derived
conditions are not satisfied, and in the stochastic case when the information
is restricted to the past states only. The hierarchical result also needs
to be developed (it will be an easy extension of the results stated here)
because of the unique feature of RSP that the follower is under no protection
from the leader's manipulation. The conditions for RSP solutions should be
interpreted from a qualitative, perhaps geometric, point of view. Specific
applications should also be investigated to demonstrate that RSP is not merely
a theoretical pastime but has definite practical value.
APPENDIX I


PROOF  OF  PROPOSITION  3.1 


We prove it by induction.

At stage N-1, calculate $u_i(N-1)$ in terms of $x(N-1)$:

$$u_i(N-1) = -R_{1i}^{-1}(N-1)B_i'(N-1)Q_1(N)x(N)$$

$$\qquad = -R_{1i}^{-1}(N-1)B_i'(N-1)Q_1(N)\big(A(N-1)x(N-1) + B_1(N-1)u_1(N-1) + B_2(N-1)u_2(N-1)\big).$$


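Stacking these relations for $i = 1, 2$ (with $R_{1i}$ and $B_i$ evaluated at stage N-1),

$$\begin{bmatrix} I + R_{11}^{-1}B_1'Q_1(N)B_1 & R_{11}^{-1}B_1'Q_1(N)B_2 \\ R_{12}^{-1}B_2'Q_1(N)B_1 & I + R_{12}^{-1}B_2'Q_1(N)B_2 \end{bmatrix} \begin{bmatrix} u_1(N-1) \\ u_2(N-1) \end{bmatrix} = -\begin{bmatrix} R_{11}^{-1}B_1'Q_1(N) \\ R_{12}^{-1}B_2'Q_1(N) \end{bmatrix} A(N-1)x(N-1).$$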


By assumption,

$$\begin{bmatrix} u_1(N-1) \\ u_2(N-1) \end{bmatrix} = \begin{bmatrix} g_1(N-1) \\ g_2(N-1) \end{bmatrix} A(N-1)x(N-1).$$


Assume a similar procedure can be carried out to obtain

$$u_i(j) = g_i(j)x(j) \quad \text{for } j = k+1, \ldots, N-1.$$

Then,

$$x(i) = \prod_{\ell=k+1}^{i-1}\big(A(\ell) + B_1(\ell)g_1(\ell) + B_2(\ell)g_2(\ell)\big)\,x(k+1),$$

$$u_j(k) = -\sum_{i=k+1}^{N} R_{1j}^{-1}(k)B_j'(k)\Phi'(i-1,k)Q_1(i)x(i)$$

$$\qquad = \Big[-\sum_{i=k+1}^{N} R_{1j}^{-1}(k)B_j'(k)\Phi'(i-1,k)Q_1(i)\prod_{\ell=k+1}^{i-1}\big(A(\ell) + B_1(\ell)g_1(\ell) + B_2(\ell)g_2(\ell)\big)\Big]x(k+1).$$
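The closed-loop recursion behind the induction step can be illustrated numerically: if both controls are linear in the state, u_i(j) = g_i(j)x(j), then each stage multiplies the state by A(j) + B_1(j)g_1(j) + B_2(j)g_2(j). The following is a minimal sketch under that assumption, using plain nested lists for matrices; the helper names and the numerical values are illustrative and are not taken from the report.

```python
def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def mat_add(*Ms):
    """Entrywise sum of matrices of equal shape."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]


def closed_loop_matrix(A, B1, g1, B2, g2):
    """A + B1 g1 + B2 g2 for one stage."""
    return mat_add(A, mat_mul(B1, g1), mat_mul(B2, g2))


def propagate_closed_loop(x, stages):
    """Apply the product of closed-loop matrices, stage by stage."""
    for A, B1, g1, B2, g2 in stages:
        x = mat_mul(closed_loop_matrix(A, B1, g1, B2, g2), x)
    return x


def simulate_with_controls(x, stages):
    """Same trajectory, computed from the controls u_i = g_i x explicitly."""
    for A, B1, g1, B2, g2 in stages:
        u1, u2 = mat_mul(g1, x), mat_mul(g2, x)
        x = mat_add(mat_mul(A, x), mat_mul(B1, u1), mat_mul(B2, u2))
    return x


if __name__ == "__main__":
    stage = ([[1, 1], [0, 1]], [[0], [1]], [[-1, 0]], [[1], [0]], [[0, 2]])
    x_start = [[1], [2]]
    print(propagate_closed_loop(x_start, [stage, stage]))
    print(simulate_with_controls(x_start, [stage, stage]))
```

Both routines return the same state, which is the content of the product formula for x(i) in terms of x(k+1).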


APPENDIX  II 


PROOF  OF  PROPOSITION  3.6 


Substitute $u_i = G_iY_i$ into (3.53):

$$J_1 = E\{x'(Q_1 + C_1'G_1'R_{11}G_1C_1 + C_2'G_2'R_{12}G_2C_2)x\}$$

$$\quad = \mathrm{tr}\{(Q_1 + C_1'G_1'R_{11}G_1C_1 + C_2'G_2'R_{12}G_2C_2)E[xx']\}.$$
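The step from the expected quadratic form to a trace uses the identity E{x'Mx} = tr(M E[xx']). A minimal numerical check is sketched below for a finite, equally weighted set of sample vectors; the sample values and function names are hypothetical and serve only to illustrate the identity.

```python
from fractions import Fraction


def quad_form(x, M):
    """x' M x for a vector x (list) and a square matrix M (nested list)."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


def second_moment(samples):
    """E[x x'] for equally likely sample vectors, in exact arithmetic."""
    n, count = len(samples[0]), len(samples)
    return [[Fraction(sum(s[i] * s[j] for s in samples), count) for j in range(n)]
            for i in range(n)]


def trace_product(M, P):
    """tr(M P) without forming the product explicitly."""
    n = len(M)
    return sum(M[i][j] * P[j][i] for i in range(n) for j in range(n))


if __name__ == "__main__":
    samples = [(1, 2), (3, -1), (0, 4)]
    M = [[2, 1], [1, 3]]
    mean_cost = Fraction(sum(quad_form(s, M) for s in samples), len(samples))
    print(mean_cost, trace_product(M, second_moment(samples)))
```

The two printed quantities coincide, which is why the cost depends on the strategies only through the second moment matrix E[xx'].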
Substitute $u_i = G_iY_i$ into the state equation; then

$$(I - H_1G_1C_1 - H_2G_2C_2)x = Dx_0.$$

Therefore,

$$(I - H_1G_1C_1 - H_2G_2C_2)E[xx'](I - H_1G_1C_1 - H_2G_2C_2)' = DE[x_0x_0']D',$$

where

$$DE[x_0x_0']D' = D\Sigma_0D'$$

is assumed known.

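Let $P = E[xx']$ and use a matrix Lagrange multiplier $\Lambda$; we have
converted the problem to one of choosing $G_1$, $G_2$, $P$, $\Lambda$ to minimize

$$L(G_1,G_2,P,\Lambda) = \mathrm{tr}\big[(Q_1 + C_1'G_1'R_{11}G_1C_1 + C_2'G_2'R_{12}G_2C_2)P$$

$$\qquad + \Lambda\big((I-H_1G_1C_1-H_2G_2C_2)P(I-H_1G_1C_1-H_2G_2C_2)' - D\Sigma_0D'\big)\big].$$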
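Set

$$\frac{d}{d\epsilon}L(G_1,G_2,P,\Lambda+\epsilon\Delta\Lambda)\Big|_{\epsilon=0} = 0 \quad \forall\,\Delta\Lambda \in R^{(N+1)n\times(N+1)n};$$

we get

$$(I-H_1G_1C_1-H_2G_2C_2)P(I-H_1G_1C_1-H_2G_2C_2)' = D\Sigma_0D'. \qquad (3.54)$$

Set

$$\frac{d}{d\epsilon}L(G_1,G_2,P+\epsilon\Delta P,\Lambda)\Big|_{\epsilon=0} = 0 \quad \forall\,\Delta P \in R^{(N+1)n\times(N+1)n};$$

we get

$$(Q_1 + C_1'G_1'R_{11}G_1C_1 + C_2'G_2'R_{12}G_2C_2) + (I-H_1G_1C_1-H_2G_2C_2)'\Lambda(I-H_1G_1C_1-H_2G_2C_2) = 0. \qquad (3.55)$$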
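For reference, the variation with respect to $P$ that leads to (3.55) can be written out in one line. The shorthand $M$ and $W$ is introduced here only for this remark and is not notation from the report; $\Lambda$ denotes the matrix Lagrange multiplier.

$$M = Q_1 + C_1'G_1'R_{11}G_1C_1 + C_2'G_2'R_{12}G_2C_2, \qquad W = I - H_1G_1C_1 - H_2G_2C_2,$$

$$\frac{d}{d\epsilon}L(G_1,G_2,P+\epsilon\Delta P,\Lambda)\Big|_{\epsilon=0} = \mathrm{tr}[M\Delta P + \Lambda W\Delta P\,W'] = \mathrm{tr}[(M + W'\Lambda W)\Delta P],$$

where the last equality uses the cyclic property of the trace. Since this must vanish for every $\Delta P$, it forces $M + W'\Lambda W = 0$, which is (3.55).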


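Set

$$\frac{d}{d\epsilon}L(G_1+\epsilon\Delta G_1,G_2,P,\Lambda)\Big|_{\epsilon=0} = 0$$

for all lower block triangular matrices $\Delta G_1 \in R^{Nm_1\times Nr_1}$ with each block of dimension $m_1\times r_1$, which gives

$$\mathrm{tr}\{C_1P(C_1'G_1'R_{11} - (I-H_1G_1C_1-H_2G_2C_2)'\Lambda H_1)\Delta G_1\} = 0.$$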


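Let

$$K = C_1P(C_1'G_1'R_{11} - (I-H_1G_1C_1-H_2G_2C_2)'\Lambda H_1) = \begin{bmatrix} k_{11} & \cdots & k_{1N} \\ \vdots & & \vdots \\ k_{N1} & \cdots & k_{NN} \end{bmatrix},$$

where $k_{ij}$ is an $r_1\times m_1$ block, and

$$\Delta G_1 = \begin{bmatrix} \Delta G_{11} & & 0 \\ \vdots & \ddots & \\ \Delta G_{N1} & \cdots & \Delta G_{NN} \end{bmatrix},$$

where $\Delta G_{ij}$ is an $m_1\times r_1$ block. Then

$$\mathrm{tr}\left\{[k_{11}\ \cdots\ k_{1N}]\begin{bmatrix} \Delta G_{11} \\ \vdots \\ \Delta G_{N1} \end{bmatrix} + \cdots + [k_{N-1,N-1}\ \ k_{N-1,N}]\begin{bmatrix} \Delta G_{N-1,N-1} \\ \Delta G_{N,N-1} \end{bmatrix} + k_{NN}\Delta G_{NN}\right\} = 0 \quad \forall\,\Delta G_{ij} \in R^{m_1\times r_1},$$

which implies

$$[k_{11}\ \cdots\ k_{1N}] = 0, \quad \ldots, \quad [k_{N-1,N-1}\ \ k_{N-1,N}] = 0, \quad [k_{NN}] = 0,$$

that is,

$$ut_1[K] = 0.$$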
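The role of the upper block triangular part can be illustrated numerically: for a lower block triangular perturbation, the trace of K times the perturbation only involves the blocks of K on or above the block diagonal, so requiring the trace to vanish for every such perturbation constrains exactly those blocks. The sketch below assumes K has N x N blocks of size r x m and the perturbation has N x N blocks of size m x r; the function names and the test matrices are hypothetical.

```python
def mat_mul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


def upper_block_triangular(K, N, r, m):
    """Keep the blocks k_ij with j >= i (block size r x m); zero the others."""
    return [[K[a][b] if b // m >= a // r else 0 for b in range(N * m)]
            for a in range(N * r)]


def is_lower_block_triangular(G, N, m, r):
    """True if every block G_ij with j > i (block size m x r) is zero."""
    return all(G[a][b] == 0
               for a in range(N * m) for b in range(N * r) if b // r > a // m)


if __name__ == "__main__":
    N, r, m = 2, 1, 2
    K = [[1, 2, 3, 4], [5, 6, 7, 8]]          # 2 x 2 blocks of size 1 x 2
    dG = [[1, 0], [2, 0], [3, 4], [5, 6]]     # 2 x 2 blocks of size 2 x 1
    print(is_lower_block_triangular(dG, N, m, r))
    print(trace(mat_mul(K, dG)), trace(mat_mul(upper_block_triangular(K, N, r, m), dG)))
```

Replacing K by its upper block triangular part leaves the trace unchanged in this example, consistent with the stationarity condition constraining only that part of K.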

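Similarly for $G_2$, we have

$$ut_2[C_2P(C_2'G_2'R_{12} - (I-H_1G_1C_1-H_2G_2C_2)'\Lambda H_2)] = 0.$$

For second order sufficiency conditions, we need

$$\frac{d^2}{d\epsilon^2}L(G_1+\epsilon\Delta G_1,G_2,P,\Lambda)\Big|_{\epsilon=0} > 0, \qquad \frac{d^2}{d\epsilon^2}L(G_1,G_2+\epsilon\Delta G_2,P,\Lambda)\Big|_{\epsilon=0} > 0,$$

i.e., we need

$$\mathrm{tr}\{(C_iPC_i')(\Delta G_i'(R_{1i} + H_i'\Lambda H_i)\Delta G_i)\} > 0 \quad \forall\,\Delta G_i \text{ lower block triangular}, \quad i = 1, 2.$$

Sufficient conditions are

$$P > 0, \qquad R_{1i} + H_i'\Lambda H_i > 0, \quad i = 1, 2.$$

Q.E.D.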